We initiate the study of zero-error communication via quantum channels when the receiver and sender have at their disposal a noiseless feedback channel of unlimited quantum capacity, generalizing Shannon's zero-error communication theory with instantaneous feedback.

We first show that this capacity is a function only of the linear span of Choi-Kraus operators of the channel, which generalizes the bipartite equivocation graph of a classical channel, and which we dub "non-commutative bipartite graph". Then we go on to show that the feedback-assisted capacity is non-zero (allowing for a constant amount of activating noiseless communication) if and only if the non-commutative bipartite graph is non-trivial, and give a number of equivalent characterizations. This result involves a far-reaching extension of the "conclusive exclusion" of quantum states [Pusey/Barrett/Rudolph, Nature Phys. 8(6):475-478, 2012].

We then present an upper bound on the feedback-assisted zero-error capacity, motivated by a conjecture originally made by Shannon and proved later by Ahlswede. We demonstrate that this bound has many good properties, including being additive and given by a minimax formula. We also prove a coding theorem showing that this quantity is the entanglement-assisted capacity against an adversarially chosen channel from the set of all channels with the same Choi-Kraus span, which can also be interpreted as the feedback-assisted unambiguous capacity. The proof relies on a generalization of the "Postselection Lemma" (de Finetti reduction) [Christandl/König/Renner, Phys. Rev. Lett. 102:020504, 2009] that allows one to reflect additional constraints, and which we believe to be of independent interest. This capacity is a relaxation of the feedback-assisted zero-error capacity; however, we have to leave open the question of whether they coincide in general.

We illustrate our ideas with a number of examples, including classical-quantum channels 
and Weyl diagonal channels, and close with an extensive discussion of open questions. 


A preliminary version of this paper was presented as a poster at QIP 2012, 12-16 December 2011, Montreal.
I. ZERO-ERROR COMMUNICATION ASSISTED BY NOISELESS QUANTUM FEEDBACK

In information theory it is customary to consider not only asymptotically long messages but also asymptotically vanishing, but nonzero error probabilities, which leads to a probabilistic theory of communication characterized by entropic capacity formulas [14, 44]. It is well-known that when communicating by block codes over a discrete memoryless channel at rate below the capacity, the error probability goes to zero exponentially in the block length, and while it is one of the major open problems of information theory to characterize the tradeoff between rate and error exponent in general, we have by now a fairly good understanding of it. However, if the error probability is required to vanish faster than exponentially, or equivalently is required to be zero exactly (at least in the case of finite alphabets), we enter the strange and much less understood realm of zero-error information theory [37, 45], which concerns asymptotic combinatorial problems, most of which are unsolved and are considered very difficult. There are a couple of exceptions to this rather depressing state of affairs, one having been already identified by Shannon in his founding paper [45], namely the discrete memoryless channel $N(y|x)$ assisted by instantaneous noiseless feedback, whose capacity is given by the fractional packing number of a bipartite graph $\Gamma$ representing the possible transitions $N(y|x) > 0$. The other one is the recently considered assistance by no-signalling correlations [20], which is also completely solved in terms of the fractional packing number of the same bipartite graph $\Gamma$.

Recent years have seen attempts to create a quantum zero-error information theory, identifying some rather strange phenomena there such as superactivation [18, 22] or entanglement advantage for classical channels [19, 39], but resulting also in some general structural progress such as a quantum channel version of the Lovász number [23]. Motivated by the success in the above-mentioned two models, two of us in [24] (see also [25]) have developed a theory of zero-error communication over memoryless quantum channels assisted by quantum no-signalling correlations, which largely (if not completely) mirrors the classical channel case; in particular, it yielded the first capacity interpretation of the Lovász number of a graph. Some of the techniques and insights developed in [24] will play a central role also in the present paper.

In the present paper, we take as our point of departure the other successful case, Shannon's theory of zero-error communication assisted by noiseless instantaneous feedback. In detail, consider a quantum channel $\mathcal{N} : \mathcal{L}(A) \to \mathcal{L}(B)$, i.e. a completely positive and trace preserving (cptp) linear map from the operators on $A$ to those of $B$ (both finite-dimensional Hilbert spaces), where $\mathcal{L}(A)$ denotes the linear operators (i.e. matrices) on $A$, with Choi-Kraus and Stinespring representations

$$\mathcal{N}(\rho) = \sum_j E_j \rho E_j^\dagger = \mathrm{Tr}_C\, V \rho V^\dagger,$$

for linear operators $E_j : A \to B$ such that $\sum_j E_j^\dagger E_j = \mathbb{1}$, and an isometry $V : A \to B \otimes C$, respectively. The linear span of the Choi-Kraus operators is denoted by

$$K = \mathcal{K}(\mathcal{N}) := \operatorname{span}\{E_j : j\} < \mathcal{L}(A \to B),$$

where "<" means that $K$ is a subspace of $\mathcal{L}(A \to B)$, the linear operators (i.e. matrices) mapping $A$ to $B$. We will discuss a model of communication where Alice uses the channel $n$ times in succession, allowing Bob after each round to send her back an arbitrary quantum system. They may also share an entangled state prior to the first round (if not, they can have it anyway from the second round on, since Bob could use the first feedback to create an arbitrary entangled state). Their goal is to allow Alice to send one of $M$ messages down the channel uses such that Bob is able to distinguish them perfectly. More formally, the most general quantum feedback-assisted code consists of a state (w.l.o.g. pure) $|\phi\rangle \in X_0 \otimes Y_0$, and for each message $m = 1, \ldots, M$ isometries for encoding and feedback decoding

$$U_t^{(m)} : X_{t-1} \otimes F_{t-1} \to A_t \otimes X_t, \qquad W_t : Y_{t-1} \otimes B_t \to F_t \otimes Y_t, \qquad (1)$$

for $t = 1, \ldots, n$ and appropriate local quantum systems $X_t$ (Alice) and $Y_t$ (Bob), as well as the feedback-carrying systems $F_t$; see Fig. 1. For consistency (and w.l.o.g.), $F_0 = F_n = \mathbb{C}$ are trivial. Note that Bob can use the feedback channel to create any entangled state $|\phi\rangle$ with Alice for later use before they actually send messages. We use isometries, rather than general cptp maps, to represent encoders and decoders in the feedback-assisted communication scheme, because by the Stinespring dilation [48], all local cptp maps can be "purified" to local isometries. Thus every seemingly more general protocol involving cptp maps can be purified to one of the above form. We will find this form convenient in the later analysis as it allows us to reason on the level of Hilbert space vectors.

We call this quantum feedback-assisted code a zero-error code if there is a measurement on $Y_n$ that distinguishes Bob's output states $\rho^{(m)} = \sum_{\underline{j}} \rho^{(m)}_{\underline{j}}$ with certainty, where the sum is over the states

$$\rho^{(m)}_{\underline{j}} = \mathrm{Tr}_{X_n} \left( \prod_{t=n}^{1} W_t E_{j_t} U_t^{(m)} \right) |\phi\rangle\langle\phi| \left( \prod_{t=n}^{1} W_t E_{j_t} U_t^{(m)} \right)^{\dagger}, \qquad (2)$$

which are the output states given a specific sequence $\underline{j} = j_1 \ldots j_n$ of Kraus operators. [Note that here and below, for convenience, we use $\prod_{t=n}^{1} Q_t$ to represent right-to-left multiplication of operators $Q_t$, namely $\prod_{t=n}^{1} Q_t := Q_n \cdots Q_1$.] In other words, these states have to have mutually orthogonal supports, i.e. for all $m \neq m'$, all $\underline{j}$, $\underline{k}$ and all $\xi \in \mathcal{L}(X_n)$,

$$0 = \langle\phi| \left( \prod_{t=n}^{1} W_t E_{k_t} U_t^{(m')} \right)^{\dagger} (\xi \otimes \mathbb{1}_{Y_n}) \left( \prod_{t=n}^{1} W_t E_{j_t} U_t^{(m)} \right) |\phi\rangle.$$

FIG. 1. Diagrammatic representation of a feedback-assisted code for messages $m$ sent down a channel $\mathcal{N}$ used $n$ times, in the form of a schematic circuit diagram. All boxes are isometries (acting on suitably large input and output quantum registers), and the solid lines and arrows represent the "sending" of the respective register. Bob's final output state $\rho^{(m)}$ after $n$ rounds of using the channel and feedback is in register $Y_n$.


By linearity, we see that this condition depends only on the linear span of the Choi-Kraus operator space $K$; in fact it can evidently be expressed as the orthogonality of a tensor defined as a function of $|\phi\rangle$, the $U_t^{(m)}$ and $W_t$, to the subspace $(K^\dagger \otimes K)^{\otimes n}$; cf. similar albeit simpler characterizations of zero-error and entanglement-assisted zero-error codes in terms of the "non-commutative graph" $S = K^\dagger K := \operatorname{span}\{E_k^\dagger E_j : k, j\} < \mathcal{L}(A)$ [18, 22, 23], and of no-signalling assisted zero-error codes in terms of the "non-commutative bipartite graph" $K$ [24]. Thus we have proved
Proposition 1 A quantum feedback-assisted code for a channel $\mathcal{N}$ being zero-error is a property solely of the Choi-Kraus space $K = \mathcal{K}(\mathcal{N})$. The maximum number of messages in a feedback-assisted zero-error code is denoted $M_f(n; K)$. Hence, the quantum feedback-assisted zero-error capacity of $\mathcal{N}$,

$$C_{0EF}(K) := \lim_{n \to \infty} \frac{1}{n} \log M_f(n; K) = \sup_n \frac{1}{n} \log M_f(n; K),$$

is a function only of $K$. □

In the case of a classical channel $N : \mathcal{X} \to \mathcal{Y}$ with transition probabilities $N(y|x)$, assisted by classical noiseless feedback, the above problem was first studied, and completely solved, by Shannon [45]. To be precise, his model has noiseless instantaneous feedback of the channel output back to the encoder; it is clear that any protocol with general actions (noisy channel acting on the output) by the receiver can be simulated by the receiver storing the output and the encoder getting a copy of the channel output, if shared randomness is available. Our model differs from this only by the additional availability of entanglement; that this does not further increase the capacity follows from [20], see our comments below.

Following Shannon, we introduce the (bipartite) equivocation graph $\Gamma = \Gamma(N)$ on $\mathcal{X} \times \mathcal{Y}$, which has an edge $xy$ iff $N(y|x) > 0$, i.e. the adjacency matrix is $\Gamma(y|x) = \lceil N(y|x) \rceil$; furthermore the confusability graph $G = G(N)$ on $\mathcal{X}$, with an edge $x \sim x'$ iff there exists a $y$ such that $N(y|x)N(y|x') > 0$, i.e., iff the neighbourhoods of $x$ and $x'$ in $\Gamma$ intersect. The feedback-assisted zero-error capacity $C_{0F}(N)$ of the channel $N$ can be seen to depend only on $\Gamma$.
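The two graphs just defined are easy to compute from a transition matrix. The following is a minimal sketch, not part of the original text: the function names and the example matrix are hypothetical, and `N[x][y]` is assumed to store $N(y|x)$.

```python
from itertools import combinations

def equivocation_graph(N):
    """Edge set of the bipartite graph Gamma: (x, y) is an edge iff N(y|x) > 0."""
    return {(x, y) for x, row in enumerate(N) for y, p in enumerate(row) if p > 0}

def confusability_graph(N):
    """Edge set of G: x ~ x' iff some output y has N(y|x) * N(y|x') > 0."""
    edges = set()
    for x, xp in combinations(range(len(N)), 2):
        if any(p > 0 and q > 0 for p, q in zip(N[x], N[xp])):
            edges.add((x, xp))
    return edges

# Hypothetical example channel: input x never produces output y = x.
N = [
    [0.0, 0.5, 0.5],
    [0.5, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
gamma = equivocation_graph(N)
G = confusability_graph(N)
```

For this example channel every pair of inputs shares an output, so $G$ is the complete graph on three vertices, while $\Gamma$ is missing exactly the three "diagonal" edges.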

Note that for (the quantum realisation of) a classical channel, i.e.

$$\mathcal{N}(\rho) = \sum_{xy} N(y|x)\, |y\rangle\langle x| \rho |x\rangle\langle y|,$$

the corresponding subspace is given by

$$K = \operatorname{span}\{|y\rangle\langle x| : xy \text{ is an edge in } \Gamma\},$$

so $K$ should really be understood as the quantum generalisation of the equivocation graph (a non-commutative bipartite graph) [24], much as $S = K^\dagger K$ was advocated in [23] as a quantum generalisation of an undirected graph.

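Shannon proved

$$C_{0F}(N) = C_{0F}(\Gamma) = \begin{cases} 0 & \text{if } G \text{ is a complete graph (iff } C_0(N) = 0), \\ \log \alpha^*(\Gamma) & \text{otherwise.} \end{cases} \qquad (3)$$

Here, $\alpha^*(\Gamma)$ is the so-called fractional packing number of $\Gamma$, defined as a linear programme, whose dual linear programme is the fractional covering number [42, 45]:

$$\alpha^*(\Gamma) = \max \sum_x w_x \quad \text{s.t.} \quad \forall x\ \ 0 \le w_x, \quad \forall y\ \ \sum_x w_x \Gamma(y|x) \le 1,$$

$$\phantom{\alpha^*(\Gamma)} = \min \sum_y v_y \quad \text{s.t.} \quad \forall y\ \ 0 \le v_y, \quad \forall x\ \ \sum_y v_y \Gamma(y|x) \ge 1.$$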

This number appears also in other zero-error communication problems, namely as the zero-error capacity of the channel assisted by no-signalling correlations [20]. There, it is also shown to be the asymptotic simulation cost of a channel with bipartite graph $\Gamma$ in the presence of shared randomness. This shows that for a classical channel with bipartite graph $\Gamma$, interpreted as a quantum channel $\mathcal{N}$ with non-commutative bipartite graph $K$, $C_{0F}(\Gamma) = C_{0EF}(K)$.
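By linear programming duality, a feasible packing $w$ and a feasible covering $v$ with equal total weight certify the value of $\alpha^*(\Gamma)$. The sketch below (hypothetical helper names; it checks a supplied certificate rather than solving the linear programme) illustrates this for an assumed example graph in which input $x$ is connected to every output except $y = x$.

```python
from fractions import Fraction

def is_feasible_packing(gamma, w):
    """Primal feasibility: w_x >= 0 and, for every output y, sum_x w_x Gamma(y|x) <= 1."""
    n_out = len(gamma[0])
    return all(wx >= 0 for wx in w) and all(
        sum(w[x] * gamma[x][y] for x in range(len(gamma))) <= 1 for y in range(n_out)
    )

def is_feasible_covering(gamma, v):
    """Dual feasibility: v_y >= 0 and, for every input x, sum_y v_y Gamma(y|x) >= 1."""
    return all(vy >= 0 for vy in v) and all(
        sum(v[y] * gamma[x][y] for y in range(len(v))) >= 1 for x in range(len(gamma))
    )

def certified_alpha_star(gamma, w, v):
    """Return alpha*(Gamma) if (w, v) is a matching primal/dual pair, else None."""
    if is_feasible_packing(gamma, w) and is_feasible_covering(gamma, v) and sum(w) == sum(v):
        return sum(w)
    return None

# Assumed example: gamma[x][y] = Gamma(y|x), zero exactly on the diagonal.
gamma = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
half = Fraction(1, 2)
alpha = certified_alpha_star(gamma, [half] * 3, [half] * 3)
```

Exact rational arithmetic is used so that the equality test between primal and dual values is not affected by rounding.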
The first case in eq. (3) of a complete graph $G$ is easy to understand: whatever the parties do, and regardless of the use of feedback, any two inputs may lead to the same output sequence, so not a single bit can be transmitted with certainty. In either case, Shannon showed that only some arbitrarily small rate of perfect communication (actually a constant amount, dependent only on $\Gamma$) is sufficient to achieve what we might call the activated capacity $\overline{C}_{0F}(N)$, which is always equal to $\log \alpha^*(\Gamma)$. This was understood better in the work of Elias [28], who showed that the capacity of zero-error list decoding of $N$ (with arbitrary but constant list size) is exactly $\log \alpha^*(\Gamma)$. Thus a coding scheme for $N$ with feedback would consist of a zero-error list code with list size $L$ and rate $R$ close to $\log \alpha^*(\Gamma)$ for $n$ uses of the channel $N$, followed by feedback in which Bob lets Alice know the list of $L$ items in which he now knows the message falls, followed by a noiseless transmission of $\log L$ bits from Alice to resolve the remaining ambiguity. Shannon's scheme [45] is based on a similar idea, but whittles down the list by a constant factor in each round, so Bob needs to update Alice on the remaining list after each channel use. The constant noiseless communication at the end of this protocol can be transmitted using an unassisted zero-error code via the given channel $N$ (at most $\log L$ uses), or via an activating noiseless channel.

The dichotomy in eq. (3) has the following quantum channel analogue (in fact, generalization):

Proposition 2 For any non-commutative bipartite graph $K = \mathcal{K}(\mathcal{N}) < \mathcal{L}(A \to B)$, the feedback-assisted zero-error capacity of $K$ vanishes, $C_{0EF}(K) = 0$, if and only if the associated non-commutative graph is complete, i.e. $S = K^\dagger K = \mathcal{L}(A)$, which is equivalent to vanishing entanglement-assisted zero-error capacity, $C_{0E}(S) = 0$.

Proof Clearly $C_{0EF}(K) \ge C_{0E}(S)$ since on the right hand side we simply do not use feedback, but any code is still a feedback-assisted code. Hence, if the latter is positive then so is the former. It is well known that if $S \neq \mathcal{L}(A)$, then $C_{0E}(S) \ge 1 > 0$; in fact each channel use can transmit at least one bit [22, 23].

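Conversely, let us assume that $C_{0E}(S) = 0$, i.e. $S = K^\dagger K = \mathcal{L}(A)$. We will show by induction on $t$ that for any two distinct messages, w.l.o.g. $b = 0, 1$, Bob's output states after $t$ rounds, $\rho_t^{(b)}$ on $Y_t$, cannot be orthogonally supported, meaning $M_f(n; K) = 1$. Here,

$$\rho_t^{(b)} = \sum_{j_1 \ldots j_t} \mathrm{Tr}_{X_t F_t} |\phi^{(b)}_{j_1 \ldots j_t}\rangle\langle\phi^{(b)}_{j_1 \ldots j_t}|, \quad \text{with} \quad |\phi^{(b)}_{j_1 \ldots j_t}\rangle = \prod_{i=t}^{1} W_i E_{j_i} U_i^{(b)} |\phi\rangle \in X_t \otimes F_t \otimes Y_t.$$

This is clearly true for $t = 0$ since at that point Alice and Bob share only $|\phi\rangle^{X_0 Y_0}$, hence $\rho_0^{(0)} = \rho_0^{(1)} = \mathrm{Tr}_{X_0} |\phi\rangle\langle\phi|$. For $t > 0$, let Bob after $t - 1$ rounds have one of the states $\rho_{t-1}^{(b)}$; by the induction hypothesis, $\rho_{t-1}^{(0)} \not\perp \rho_{t-1}^{(1)}$, by a slight abuse of notation meaning that the supports are not orthogonal, or equivalently that the operators are not orthogonal with respect to the Hilbert-Schmidt inner product. This means that there are indices $j_1 \ldots j_{t-1}$ and $k_1 \ldots k_{t-1}$ such that

$$\mathrm{Tr}_{X_{t-1} F_{t-1}} |\phi^{(0)}_{j_1 \ldots j_{t-1}}\rangle\langle\phi^{(0)}_{j_1 \ldots j_{t-1}}| \not\perp \mathrm{Tr}_{X_{t-1} F_{t-1}} |\phi^{(1)}_{k_1 \ldots k_{t-1}}\rangle\langle\phi^{(1)}_{k_1 \ldots k_{t-1}}|. \qquad (4)$$

This can be expressed equivalently as

$$\mathrm{Tr}_{Y_{t-1}} |\phi^{(0)}_{j_1 \ldots j_{t-1}}\rangle\langle\phi^{(1)}_{k_1 \ldots k_{t-1}}| \neq 0.$$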
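Now, in the $t$-th round, Alice applies the isometry $U_t^{(b)} : X_{t-1} F_{t-1} \to X_t A$ to the $X$ and $F$ registers of $|\phi^{(b)}_{t-1}\rangle$ (shorthand for $|\phi^{(0)}_{j_1 \ldots j_{t-1}}\rangle$ and $|\phi^{(1)}_{k_1 \ldots k_{t-1}}\rangle$, respectively), hence for $|\psi^{(b)}\rangle = U_t^{(b)} |\phi^{(b)}_{t-1}\rangle$ (as we do not touch the $Y_{t-1}$ register)

$$\mathrm{Tr}_{Y_{t-1}} |\psi^{(0)}\rangle\langle\psi^{(1)}| = \mathrm{Tr}_{Y_{t-1}}\, U_t^{(0)} |\phi^{(0)}_{t-1}\rangle\langle\phi^{(1)}_{t-1}| U_t^{(1)\dagger} \neq 0. \qquad (5)$$

After that, the channel action consists in one of the Choi-Kraus operators $E_j : A \to B$. Let us assume, with the aim of establishing a contradiction, that Bob's states after the channel action were orthogonal, i.e. for all $j$ and $k$,

$$\mathrm{Tr}_{X_t} E_j \psi^{(0)} E_j^\dagger \perp \mathrm{Tr}_{X_t} E_k \psi^{(1)} E_k^\dagger,$$

where $\psi^{(b)} = |\psi^{(b)}\rangle\langle\psi^{(b)}|$. In other words, for all $j$, $k$ and operators $\xi$ on $X_t$,

$$0 = \langle\psi^{(1)}|\, \xi \otimes E_k^\dagger E_j \otimes \mathbb{1}\, |\psi^{(0)}\rangle = \mathrm{Tr}\left[ (\xi \otimes E_k^\dagger E_j)\, \mathrm{Tr}_{Y_{t-1}} |\psi^{(0)}\rangle\langle\psi^{(1)}| \right].$$

But since $\xi$ is arbitrary and the $E_k^\dagger E_j$ span $\mathcal{L}(A)$, this would imply $\mathrm{Tr}_{Y_{t-1}} |\psi^{(0)}\rangle\langle\psi^{(1)}| = 0$, contradicting (5).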

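Thus, applying now also the isometry $W_t : B Y_{t-1} \to F_t Y_t$, we find that there exist $j_t$ and $k_t$ such that

$$\mathrm{Tr}_{X_t} |\phi^{(0)}_{j_1 \ldots j_t}\rangle\langle\phi^{(0)}_{j_1 \ldots j_t}| \not\perp \mathrm{Tr}_{X_t} |\phi^{(1)}_{k_1 \ldots k_t}\rangle\langle\phi^{(1)}_{k_1 \ldots k_t}|,$$

hence

$$\mathrm{Tr}_{X_t F_t} |\phi^{(0)}_{j_1 \ldots j_t}\rangle\langle\phi^{(0)}_{j_1 \ldots j_t}| \not\perp \mathrm{Tr}_{X_t F_t} |\phi^{(1)}_{k_1 \ldots k_t}\rangle\langle\phi^{(1)}_{k_1 \ldots k_t}|,$$

and so finally $\rho_t^{(0)} \not\perp \rho_t^{(1)}$, proving the induction step. □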

Motivated by $\overline{C}_{0F}$ of a classical channel [45], see above, we define also feedback-assisted codes with $n$ channel uses and up to $b$ noiseless classical bits of forward communication. The setup is the same as in eq. (1) and Fig. 1, with $n + b$ rounds, $n$ of which feature the isometric dilation $V$ of $\mathcal{N}$, and $b$ the isometry $V' : |i\rangle \mapsto |i\rangle|i\rangle$ ($i = 0, 1$) corresponding to the noiseless bit channel $\overline{\mathrm{id}}_2 : \rho \mapsto \sum_{i=0}^{1} |i\rangle\langle i| \rho |i\rangle\langle i|$. It is clear that the output states can be written in a way similar to eq. (2), and that the maximum number of messages in a zero-error code depends only on $n$, $b$ and $K < \mathcal{L}(A \to B)$, which we denote $M_f^{+b}(n; K)$. Clearly, $M_f^{+0}(n; K) = M_f(n; K)$ and in general, $M_f^{+(b+1)}(n; K) \ge 2 M_f^{+b}(n; K)$. Furthermore, it can easily be verified that

$$2^{-b} M_f^{+b}(n; K) \cdot 2^{-c} M_f^{+c}(m; K) \le 2^{-b-c} M_f^{+(b+c)}(n + m; K),$$

hence we can define the activated feedback-assisted zero-error capacity

$$\overline{C}_{0EF}(K) := \sup_b \sup_n \frac{1}{n} \left( \log M_f^{+b}(n; K) - b \right) = \sup_b \lim_{n \to \infty} \frac{1}{n} \log M_f^{+b}(n; K).$$


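Then the above Proposition 2 can be rephrased as

$$C_{0EF}(K) = \begin{cases} \overline{C}_{0EF}(K) & \text{if } S = K^\dagger K \neq \mathcal{L}(A), \\ 0 & \text{if } S = \mathcal{L}(A) \text{ (iff } C_{0E}(S) = 0), \end{cases} \qquad (6)$$

motivating our focusing on $\overline{C}_{0EF}(K)$ from now on.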
The rest of the present paper is organized as follows: In Section II we start with a concrete example showing the importance of measurements "conclusively excluding" hypotheses from a list of options, and go on to show several concise characterizations of nontrivial channels, i.e. those for which $\overline{C}_{0EF}(K) > 0$. In Section III we first review a characterization of the fractional packing number in terms of the Shannon capacity minimized over a set of channels, which then motivates the definition of $C_{\min E}(K)$ obtained as a minimization of the entanglement-assisted capacity over quantum channels consistent with the given non-commutative bipartite graph. $C_{\min E}(K)$ represents the best known upper bound on the feedback-assisted zero-error capacity. We illustrate the bound by showing how it allows us to determine $\overline{C}_{0EF}(K)$ for Weyl diagonal channels, i.e. $K$ spanned by discrete Weyl unitaries. We also show that $C_{\min E}(K)$ is the ordinary (small error) capacity of the system assisted by entanglement, against an adversarial choice of the channel (proof in Appendix A, based on a novel Constrained Postselection Lemma, aka "de Finetti reduction", in Appendix B). After that, we conclude in Section IV with a discussion of open questions and future work.



II. CHARACTERIZATION OF VANISHING CAPACITY $\overline{C}_{0EF}(K)$


In this section, we will prove the following result. 


Theorem 3 If the non-commutative bipartite graph $K < \mathcal{L}(A \to B)$ contains a subspace $|\beta\rangle \otimes A^\dagger < K$ with a state vector $|\beta\rangle \in B$, meaning that the constant channel $\mathcal{N}_0 : \rho \mapsto |\beta\rangle\langle\beta|\, \mathrm{Tr}\rho$ has $\mathcal{K}(\mathcal{N}_0) < K$, then $\overline{C}_{0EF}(K) = 0$; we call such $K$ trivial.

Conversely, if $K$ is nontrivial, then $\overline{C}_{0EF}(K) > 0$.

Proof ("trivial $\Rightarrow$ zero capacity") We show the stronger statement $M_f^{+b}(n; K) = 2^b$ for all $n$ and $b$. Indeed, as the zero-error condition is only a property of $K$, we may assume a concrete constant channel $\mathcal{N}_0$ with $\mathcal{K}(\mathcal{N}_0) = |\beta\rangle \otimes A^\dagger < K$. The outputs of the $n$ copies of $\mathcal{N}_0$ in the feedback code do not matter at all as they are going to be $\beta^{\otimes n}$, which Bob can create himself. Hence the only information arriving at Bob from Alice is in the $b$ classical bits in the course of the protocol. But even assisted by entanglement and feedback, Alice can convey at most $b$ noiseless bits in this way, due to the Quantum Reverse Shannon Theorem. □


The opposite implication ("nontrivial $\Rightarrow$ positive capacity") will be the subject of the remainder of this section. We will start by looking at cq-channels first (Subsection II A for pure state cq-channels, Subsection II B for a mixed state example and Subsection II C for general cq-channels), before completing the proof for general channels in Subsection II D.


A. Pure state cq-channels 

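For a given orthonormal basis $\{|i\rangle\}$ of the input space $A$, and pure states $|\psi_i\rangle$ in the output space, consider the cq-channel

$$\mathcal{N}(\rho) = \sum_i |\psi_i\rangle\langle i| \rho |i\rangle\langle\psi_i|,$$

with Kraus subspace

$$K := \mathcal{K}(\mathcal{N}) = \operatorname{span}\{|\psi_i\rangle\langle i|\}.$$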


We shall demonstrate first the following result: 
Proposition 4 For a pure state cq-channel, $\overline{C}_{0EF}(K)$ is always positive unless $K$ is trivial, which is equivalent to all $|\psi_i\rangle$ being collinear, i.e. $K = |\psi\rangle \otimes A^\dagger$ for some pure state $|\psi\rangle$.

Proof If $K$ is trivial, then the above proof of the sufficiency of triviality in Theorem 3 shows $\overline{C}_{0EF}(K) = 0$.

Conversely, if $K$ is non-trivial, then there are two output vectors, denoted $|\psi_0\rangle$ and $|\psi_1\rangle$, that are not collinear, and we shall simply restrict the channel to the corresponding inputs 0 and 1. I.e., we focus only on $K' = \operatorname{span}\{|\psi_0\rangle\langle 0|, |\psi_1\rangle\langle 1|\}$, and the corresponding channel

$$\mathcal{N}'(\rho) = |\psi_0\rangle\langle 0| \rho |0\rangle\langle\psi_0| + |\psi_1\rangle\langle 1| \rho |1\rangle\langle\psi_1|.$$

Consider using it three times, inputting only the code words 001, 010 and 100. This gives rise to output states

$$|u_a\rangle = |\psi_0\rangle|\psi_0\rangle|\psi_1\rangle, \qquad |u_b\rangle = |\psi_0\rangle|\psi_1\rangle|\psi_0\rangle, \qquad |u_c\rangle = |\psi_1\rangle|\psi_0\rangle|\psi_0\rangle,$$

which have the property that their pairwise inner products are all equal: $\langle u_x | u_y \rangle = |\langle\psi_0|\psi_1\rangle|^2 =: \epsilon$. By using the channel $3n$ times, Alice can prepare the states

$$|t_x\rangle = |u_x\rangle^{\otimes n} \quad (x = a, b, c),$$

whose pairwise inner products are all equal and indeed $\epsilon^n$, i.e. arbitrarily close to 0. Now, if $n$ is large enough (so that $\epsilon^n \le \frac{1}{2}$), there is a cptp map that Bob can apply to transform

$$|t_a\rangle \mapsto \frac{1}{\sqrt{2}}(|1\rangle + |2\rangle), \qquad |t_b\rangle \mapsto \frac{1}{\sqrt{2}}(|2\rangle + |0\rangle), \qquad |t_c\rangle \mapsto \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle).$$

(This follows from well known results on pure-state transformations, see e.g. [11].) By now it may be clear where this is going: Bob measures the computational basis and overall we obtain a classical channel $P : \{a, b, c\} \to \{0, 1, 2\}$ with exactly one 0-entry in each row and column:

$$P(0|a) = P(1|b) = P(2|c) = 0,$$

which has zero-error capacity 0, but assisted by feedback and a finite number of activating noiseless bits, it is $\log \frac{3}{2}$. We conclude that $\overline{C}_{0EF}(K) \ge \frac{1}{3n} \log \frac{3}{2} > 0$. □
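The claim that the three code-word states have equal pairwise overlaps $|\langle\psi_0|\psi_1\rangle|^2$ can be checked numerically. The following sketch uses a hypothetical pair of real qubit states at angle `theta` (an assumption made only for illustration) and also finds the smallest $n$ with $\epsilon^n \le \frac{1}{2}$, taking the threshold $\frac{1}{2}$ as an assumption matching target states with pairwise inner products $\frac{1}{2}$.

```python
import math
from itertools import combinations

def kron(u, v):
    """Kronecker (tensor) product of two vectors given as flat lists."""
    return [a * b for a in u for b in v]

def inner(u, v):
    """Inner product <u|v>, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

theta = 0.7  # hypothetical angle between the two non-collinear states
psi = {0: [1.0, 0.0], 1: [math.cos(theta), math.sin(theta)]}

def codeword_state(bits):
    """Output state |psi_{b1}>|psi_{b2}>|psi_{b3}> for a code word (b1, b2, b3)."""
    state = [1.0]
    for b in bits:
        state = kron(state, psi[b])
    return state

u = {
    "a": codeword_state((0, 0, 1)),
    "b": codeword_state((0, 1, 0)),
    "c": codeword_state((1, 0, 0)),
}
eps = abs(inner(psi[0], psi[1])) ** 2
overlaps = [abs(inner(u[x], u[y])) for x, y in combinations("abc", 2)]

# Smallest number n of repetitions with eps**n <= 1/2 (assumed threshold).
n_min = 1
while eps ** n_min > 0.5:
    n_min += 1
```

Each pair of code words differs in exactly two positions, which is why every overlap picks up exactly two factors of $\langle\psi_0|\psi_1\rangle$ (one of them conjugated).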


B. Mixed state cq-channel 

To generalize the previous treatment to mixed states, let us first look at a specific simple example: Let $|\psi_i\rangle$ ($i = 0, 1, 2$) be three mutually distinct but non-orthogonal states in $\mathbb{C}^3$, and define a cq-channel $\mathcal{N}$ with three inputs $i = 0, 1, 2$, mapping

$$0 \mapsto \tfrac{1}{2}\psi_1 + \tfrac{1}{2}\psi_2, \qquad 1 \mapsto \tfrac{1}{2}\psi_0 + \tfrac{1}{2}\psi_2, \qquad 2 \mapsto \tfrac{1}{2}\psi_0 + \tfrac{1}{2}\psi_1, \qquad (7)$$

where $\psi_j = |\psi_j\rangle\langle\psi_j|$. Thus,

$$K = \operatorname{span}\{|\psi_1\rangle\langle 0|, |\psi_2\rangle\langle 0|, |\psi_0\rangle\langle 1|, |\psi_2\rangle\langle 1|, |\psi_0\rangle\langle 2|, |\psi_1\rangle\langle 2|\},$$

and the most general channel $\mathcal{N}'$ consistent with this $K$ is a cq-channel of the form

$$i \mapsto \rho_i, \qquad \rho_i \text{ supported on } \operatorname{span}\{|\psi_j\rangle : j \in \{0, 1, 2\} \setminus i\}.$$

We shall show how to construct a zero-error scheme with feedback, achieving positive rate, at least for $|\psi_i\rangle$ that are sufficiently close to being orthogonal. For the zero-error properties, we may as well focus on $\mathcal{N}$, which is easier to reason with. For the following, it may be helpful to think of eq. (7) in a partly classical way: any input $i$ is mapped to a random $|\psi_j\rangle$, subject to $j \neq i$, so that for two uses of the channel, each pair $i_1 i_2$ is mapped randomly to one of four $|\psi_{j_1}\rangle|\psi_{j_2}\rangle$, with $j_1 \neq i_1$, $j_2 \neq i_2$. Of course, vice versa each of these nine vectors is reached from exactly four inputs.

Now, assume that the pairwise inner products of the $|\psi_i\rangle$ are small enough, i.e.

$$|\langle\psi_0|\psi_1\rangle|,\ |\langle\psi_0|\psi_2\rangle|,\ |\langle\psi_1|\psi_2\rangle| \le \epsilon,$$

to guarantee that there is a deterministic pure state transformation (by cptp map) $|\psi_{j_1}\rangle|\psi_{j_2}\rangle \mapsto |\Psi_{j_1 j_2}\rangle$, where

$$|\Psi_{j_1 j_2}\rangle = \frac{1}{\sqrt{8}} \sum_{\substack{j_1 j_2 \in I \subset \{0,1,2\}^2 \\ |I| = 2}} |I\rangle \in \mathbb{C}^{36}.$$

On these states, Bob performs a measurement in the computational basis of the $|I\rangle$, and we get an effective classical channel mapping $i_1 i_2 \in \{0, 1, 2\}^2$ randomly to some $\{j_1 j_2, k_1 k_2\} = I \subset \{0, 1, 2\}^2$, subject to the constraint

$$(j_1 \neq i_1 \ \&\ j_2 \neq i_2) \quad \text{or} \quad (k_1 \neq i_1 \ \&\ k_2 \neq i_2),$$

which means that each $I$ is reached from at most eight out of the nine pairs $i_1 i_2$. In fact, the observation of $I = \{j_1 j_2, k_1 k_2\}$ excludes at least two out of nine input symbols, namely $j_1 k_2$ and $k_1 j_2$, meaning that this classical channel has zero-error capacity (plus feedback plus a finite number of noiseless bits) of $\ge \log \frac{9}{7}$. In conclusion, we achieve for $\mathcal{N}$, and hence for any $\mathcal{N}'$ with $\mathcal{K}(\mathcal{N}') < K$, a rate of $\ge \frac{1}{2} \log \frac{9}{7} > 0$. □
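The counting argument for the effective classical channel can be verified by brute force. The sketch below (hypothetical names; it models only the constraint relating inputs $i_1 i_2$ to outcomes $I$, not the quantum transformation itself) enumerates all 36 two-element outcomes, confirms that each is reachable from at most seven inputs, and evaluates the uniform packing that this implies.

```python
from itertools import combinations, product
from fractions import Fraction

# The nine inputs i1 i2; the same pairs also serve as labels j1 j2 inside an outcome I.
pairs = list(product(range(3), repeat=2))

def can_reach(i, I):
    """Input i can lead to outcome I = {j, k} iff j or k differs from i in both positions."""
    return any(j[0] != i[0] and j[1] != i[1] for j in I)

outcomes = list(combinations(pairs, 2))  # the 36 two-element subsets I
max_reach = max(sum(can_reach(i, I) for i in pairs) for I in outcomes)

# Uniform weights 1/max_reach on all nine inputs are a feasible fractional packing,
# so their total is a lower bound on the fractional packing number.
packing_value = Fraction(1, max_reach) * len(pairs)

# The two inputs named in the text are indeed excluded by every outcome I = {j, k}.
always_excluded = all(
    not can_reach((j[0], k[1]), (j, k)) and not can_reach((k[0], j[1]), (j, k))
    for j, k in outcomes
)
```

Since the weights are feasible but not shown to be optimal here, the resulting value is only a lower bound on the fractional packing number of this constraint graph.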


C. General cq-channels 

The above examples rely on measuring the output states $\rho_i$ of the cq-channel $\mathcal{N}$ by a POVM $(M_j)$ such that the resulting classical(!) channel $N : i \mapsto j$ with $N(j|i) = \mathrm{Tr}\,\rho_i M_j$ has an equivocation graph $\Gamma$ with $\alpha^*(\Gamma) > 1$, because then $\overline{C}_{0EF}(K) \ge \overline{C}_{0F}(\Gamma) = \log \alpha^*(\Gamma) > 0$. For this, cf. the definition of $\alpha^*(\Gamma)$ above, it is necessary and sufficient that each outcome $j$ excludes at least one input $i$, i.e. $N(j|i) = \mathrm{Tr}\,\rho_i M_j = 0$, or equivalently $\rho_i \perp M_j$. A POVM $(M_j)$ with this property is said to "conclusively exclude" the set $\{\rho_i\}$ of states [41]. It is clearly only a property of the support projections $P_i$ of $\rho_i$, and w.l.o.g. the POVM is indexed by the same $i$'s, i.e. $(\Pi_i)$ such that $\rho_i \Pi_i = 0$ for all $i$, as well as $\Pi_i \ge 0$ and $\sum_i \Pi_i = \mathbb{1}$.

Our approach in the following will be to characterize when a set $\{\rho_i\}$ of states, or one of its tensor powers $\{\rho_i\}^{\otimes n} = \{\rho_{\underline{i}} = \rho_{i_1} \otimes \cdots \otimes \rho_{i_n}\}$, can be conclusively excluded. For instance, Pusey, Barrett and Rudolph [41] showed that for any two linearly independent pure states $|\psi_0\rangle$ and $|\psi_1\rangle$, it is always possible to find an integer $n$ and a $2^n$-outcome POVM $(R_{\underline{i}} : \underline{i} \in \{0, 1\}^n)$ such that

$$\mathrm{Tr}\, R_{\underline{i}} |\psi_{\underline{i}}\rangle\langle\psi_{\underline{i}}| = 0, \qquad |\psi_{\underline{i}}\rangle = |\psi_{i_1}\rangle \otimes |\psi_{i_2}\rangle \otimes \cdots \otimes |\psi_{i_n}\rangle.$$

I.e. we can design a quantum measurement that can conclusively exclude the $n$-fold states $|\psi_{\underline{i}}\rangle$ with $n$-bit strings $\underline{i} = i_1 \ldots i_n$ as outcomes, even when $|\psi_0\rangle$ and $|\psi_1\rangle$ are not orthogonal.

We will employ the powerful techniques developed in the proof of [24, Prop. 14], allowing us to show a far-reaching generalization of the Pusey/Barrett/Rudolph result [41]. The version we need can be stated as follows; it is adapted to a cq-channel with $a$-dimensional input space $A$ and output states $\rho_i$ ($i = 1, \ldots, a$), whose support projectors are $P_i$ and supports $K_i$, so that the non-commutative graph is

$$K = \sum_i K_i \otimes \langle i| := \operatorname{span}\{|\psi_i\rangle\langle i| : |\psi_i\rangle \in K_i,\ i = 1, \ldots, a\}.$$


Proposition 5 Let $(P_i)_{i=1}^{a}$ be projectors on a Hilbert space $B$, with a transitive group action by unitary conjugation on the $P_i$, i.e. we have a finite group $G$ acting transitively on the labels $i$, and a unitary representation $U^g$ such that $P_{i^g} = (U^g)^\dagger P_i U^g$ for $g \in G$.

Consider the isotypical decomposition of $U^g$,

$$B = \bigoplus_\lambda \mathcal{Q}_\lambda \otimes \mathcal{R}_\lambda$$

into irreps $\mathcal{Q}_\lambda$ of $U^g$, with multiplicity spaces $\mathcal{R}_\lambda$ (cf. [31], see also [12, 32]). Denote the number of terms $\lambda$ by $L$, and the largest occurring multiplicity by $M = \max_\lambda |\mathcal{R}_\lambda|$. If now

$$\frac{a}{\left\| \sum_i P_i \right\|_\infty} > 16 L^6 M^9,$$

then there exists a POVM $(R_i)$ with $P_i R_i = 0$ for all $i$. In other words, any set $\{\rho_i\}$ with $\operatorname{supp} \rho_i \le K_i$ can be conclusively excluded.


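Before we prove it, we use it to derive the following general result. To state it, we need some notation: For a set $\mathcal{E} = \{\rho_i\}_{i=1}^{a}$ of states, let

$$\mathcal{E}^{\otimes n} = \{\rho_{\underline{i}} = \rho_{i_1} \otimes \cdots \otimes \rho_{i_n} : \underline{i} = i_1 \ldots i_n \in [a]^n\}.$$

The strings $\underline{i} = i_1 \ldots i_n$ are classified according to type $\tau$ [17], which is the empirical distribution of the letters $i_t$, $t = 1, \ldots, n$. There are only $\binom{n+a-1}{a-1} \le (n+1)^a$ many different types. The subset of $\mathcal{E}^{\otimes n}$ corresponding to type $\tau$ is denoted

$$\mathcal{E}_\tau^{(n)} = \{\rho_{\underline{i}} = \rho_{i_1} \otimes \cdots \otimes \rho_{i_n} : \underline{i} = i_1 \ldots i_n \text{ has type } \tau\}.$$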
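The polynomial bound on the number of types is what later lets a single type class carry an inverse-polynomial fraction of the total weight. As a small illustration (a brute-force sketch with hypothetical names, feasible only for small $a$ and $n$), one can enumerate all strings and count the distinct types:

```python
from collections import Counter
from itertools import product
from math import comb

def types(a, n):
    """Set of types (letter-count vectors) of strings in [a]^n, found by brute force."""
    found = set()
    for s in product(range(a), repeat=n):
        counts = Counter(s)
        found.add(tuple(counts[letter] for letter in range(a)))
    return found

a, n = 3, 4  # hypothetical small alphabet size and block length
num_types = len(types(a, n))
largest_class = max(
    Counter(
        tuple(Counter(s)[letter] for letter in range(a))
        for s in product(range(a), repeat=n)
    ).values()
)
```

By the pigeonhole principle the largest type class contains at least $a^n$ divided by the number of types many strings, which is the classical counterpart of the inverse-polynomial loss incurred when restricting to one type.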


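We also recall the definition of the semidefinite packing number [24] of a non-commutative bipartite graph $K$ with support projection $P_{AB}$ onto the Choi-Jamiołkowski range $(\mathbb{1} \otimes K)|\Phi\rangle$, where

$$|\Phi\rangle = \frac{1}{\sqrt{|A|}} \sum_{i=1}^{|A|} |i\rangle|i\rangle$$

is the maximally entangled state:

$$\Lambda(K) = \max \mathrm{Tr}\, S_A \quad \text{s.t.} \quad 0 \le S_A, \quad \mathrm{Tr}_A\, P_{AB}(S_A \otimes \mathbb{1}_B) \le \mathbb{1}_B,$$

$$\phantom{\Lambda(K)} = \min \mathrm{Tr}\, T_B \quad \text{s.t.} \quad 0 \le T_B, \quad \mathrm{Tr}_B\, P_{AB}(\mathbb{1}_A \otimes T_B) \ge \mathbb{1}_A.$$

For the cq-channel case, $P_{AB} = \sum_i |i\rangle\langle i|^A \otimes P_i^B$, this simplifies to

$$\Lambda(K) = \max \sum_i s_i \quad \text{s.t.} \quad 0 \le s_i, \quad \sum_i s_i P_i \le \mathbb{1}. \qquad (8)$$

In particular, for the cq-graph $K$ induced by projections $\{P_i\}$ in Proposition 5, we have

$$\Lambda(K) = \frac{a}{\left\| \sum_i P_i \right\|_\infty}. \qquad (9)$$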
Theorem 6 Let $\mathcal{E} = \{\rho_i\}_{i=1}^{a}$ be a finite set of quantum states with supports $K_i = \operatorname{supp} \rho_i$, and let $K$ be the associated non-commutative bipartite graph $\sum_i K_i \otimes \langle i|$. Then the following are equivalent:

i. $\overline{C}_{0EF}(K) > 0$;

ii. $K$ is nontrivial;

iii. $\bigcap_i K_i = 0$;

iv. $\left\| \sum_i P_i \right\|_\infty < a$;

v. $\Lambda(K) > 1$;

vi. For sufficiently large $n$ and a suitable type $\tau$, the set $\mathcal{E}_\tau^{(n)}$ can be conclusively excluded.

Proof i. $\Rightarrow$ ii. has been shown in the first part (necessity) of Theorem 3 at the start of this section.

ii. $\Leftrightarrow$ iii. $|\beta\rangle \otimes A^\dagger < K = \sum_i K_i \otimes \langle i|$ if and only if $|\beta\rangle \in \bigcap_i K_i$.

iii. $\Leftrightarrow$ iv. $\left\| \sum_i P_i \right\|_\infty \le \sum_i \|P_i\|_\infty = a$, with equality if and only if there is a common eigenvector $|\beta\rangle$ with eigenvalue 1 for all of the $P_i$, i.e. $|\beta\rangle \in \bigcap_i K_i$.

iv. $\Rightarrow$ v. We check that $s_i = \frac{1}{\| \sum_j P_j \|_\infty}$ is feasible for $\Lambda(K)$; indeed,

$$\sum_i s_i P_i = \frac{1}{\left\| \sum_j P_j \right\|_\infty} \sum_i P_i \le \mathbb{1},$$

thus $\Lambda(K) \ge \frac{a}{\| \sum_j P_j \|_\infty} > 1$.

v. $\Rightarrow$ vi. Note that the non-commutative bipartite graph corresponding to $\mathcal{E}^{\otimes n}$ is $K^{\otimes n}$. Let us denote the graph of $\mathcal{E}_\tau^{(n)}$ by $K_\tau^{(n)}$. In [24] it is shown that $\Lambda(K)$ is multiplicative, $\Lambda(K^{\otimes n}) = \Lambda(K)^n$; indeed, for an optimal assignment of weights $s_i$ feasible for $\Lambda(K)$, $s_{\underline{i}} = s_{i_1} \cdots s_{i_n}$ is feasible (and optimal) for $\Lambda(K^{\otimes n})$. Hence, there exists a type $\tau$ such that

$$\Lambda(K_\tau^{(n)}) \ge \sum_{\underline{i} \in \tau} s_{\underline{i}} \ge \frac{1}{\mathrm{poly}(n)} \Lambda(K)^n. \qquad (10)$$

On the other hand, the symmetric group $S_n$ acts transitively by permutation on the strings of type $\tau$, and equivalently by permutation of the $n$ tensor factors of $B^{\otimes n}$. This representation is well known to have only $L \le \mathrm{poly}(n)$ irreps, each of which has multiplicity $M \le \mathrm{poly}(n)$. Thus, from eq. (10), we deduce that for sufficiently large $n$, $\Lambda(K_\tau^{(n)}) > 16 L^6 M^9$, which by Proposition 5 implies that the set $\mathcal{E}_\tau^{(n)}$ can be conclusively excluded.

vi. $\Rightarrow$ i. By sending signals $\underline{i} = i_1 \ldots i_n \in \tau$ and measuring the output states $\rho_{\underline{i}}$ with a conclusively excluding POVM $(M_{\underline{i}} : \underline{i} \in \tau)$, we simulate a classical channel whose bipartite equivocation graph $\Gamma$ has $\alpha^*(\Gamma) > 1$, hence $\overline{C}_{0EF}(K) \ge \frac{1}{n} \overline{C}_{0F}(\Gamma) > 0$. □
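As a small worked instance of the chain iii. $\Rightarrow$ iv. $\Rightarrow$ v., consider the following sketch. It assumes, purely for illustration, $a = 2$ rank-one projectors $P_i = |\psi_i\rangle\langle\psi_i|$ with overlap $c = |\langle\psi_0|\psi_1\rangle| < 1$; the symbol $c$ is introduced here and does not appear in the text above.

```latex
% Illustrative assumption: a = 2, P_i = |\psi_i><\psi_i|, c = |<\psi_0|\psi_1>| < 1.
\begin{align*}
  \operatorname{spec}\bigl(P_0 + P_1\bigr)\Big|_{\operatorname{span}\{\psi_0,\psi_1\}}
      &= \{\, 1 + c,\ 1 - c \,\}, \\
  \Bigl\| \sum_i P_i \Bigr\|_\infty &= 1 + c \;<\; 2 = a, \\
  s_0 = s_1 = \frac{1}{1+c}
      \ &\Longrightarrow\ s_0 P_0 + s_1 P_1 \le \mathbb{1}, \\
  \Lambda(K) &\ge \frac{2}{1+c} \;>\; 1 .
\end{align*}
% Example: c = 1/\sqrt{2} gives \Lambda(K) \ge 4 - 2\sqrt{2} \approx 1.17.
```

The bound degrades to $1$ exactly when $c \to 1$, i.e. when the two supports coincide and $K$ becomes trivial, in line with the equivalences of the theorem.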


Proof (of Proposition 5) Assume that we have a feasible $s_i = s^*$ ($i = 1, \ldots, a$) for $\Lambda(K)$ such that $\Lambda(K) \ge \sum_i s_i = s^* a > 16 L^6 M^9$. Concretely, this means that $\sum_i s_i P_i = s^* \sum_i P_i \le \mathbb{1}$.

We will show that a desired POVM $(R_i)$ can be found, such that $R_{i^g} = (U^g)^\dagger R_i U^g$ for all $i$ and $g$. The problem of finding the POVM $(R_i)$ then becomes equivalent to finding $0 \le R_0 \le \mathbb{1} - P_0$ such that

$$\frac{1}{|G|} \sum_{g \in G} (U^g)^\dagger R_0 U^g = \frac{1}{a} \mathbb{1}. \qquad (11)$$

Schur's Lemma [31] tells us

$$\frac{1}{|G|} \sum_{g \in G} (U^g)^\dagger R_0 U^g = \sum_\lambda Q_\lambda \otimes \zeta_\lambda,$$

where $Q_\lambda$ is the projection onto the irrep $\mathcal{Q}_\lambda$, and $\zeta_\lambda$ is a semidefinite operator on $\mathcal{R}_\lambda$. The equality constraints (11) on $R_0$ are equivalent to $\zeta_\lambda = \frac{1}{a} \Pi_\lambda$, with $\Pi_\lambda$ the projection onto $\mathcal{R}_\lambda$, for all $\lambda$.

Now, for each A choose an orthogonal basis { Z^ } of Hermitians over 7 Z\, with Zq A) = II A 

and ||Z ^ A) ||2 = 1 for // / 0. Then the operators Tr g^ Q\ <g> zjy form a basis of the U 9 -invariant 
operators, hence our constraints on Pq can be rephrased as 


0 < Rq < U — Pq, Tr Rq 




( 12 ) 


Notice that here, the semidefinite constraints on $R_0$ leave quite some room, whereas we have
"only" $L M^2$ linear conditions to satisfy. Given $s^*$ satisfying the constraint of $A(K)$, our strategy
now will be to show that we can construct a $0 \le R_0 \le \frac{2}{a} (\mathbb{1} - P_0)$ such that Eqs. (12) hold.

In detail, introduce a new variable $X \ge 0$, with

\[ R_0 = \frac{1}{a} (\mathbb{1} - P_0) X (\mathbb{1} - P_0), \]

which makes sure that $R_0$ is automatically supported on the complement of $P_0$. Now rewrite the
conditions (12) in terms of $X$, introducing the notation

\[ C_{\lambda\mu} = \frac{1}{\operatorname{Tr} Q_\lambda} Q_\lambda \otimes Z_\mu^{(\lambda)}, \qquad D_{\lambda\mu} = (\mathbb{1} - P_0) C_{\lambda\mu} (\mathbb{1} - P_0). \]

This gives the new form of the constraints as

\[ \operatorname{Tr} X D_{\lambda\mu} = \delta_{\mu 0}. \tag{13} \]


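Our goal will be to find a "nice" dual set $\{ \widehat{D}_{\lambda\mu} \}$ to the $\{ D_{\lambda\mu} \}$, i.e. $\operatorname{Tr} D_{\lambda\mu} \widehat{D}_{\lambda'\mu'} = \delta_{\lambda\lambda'} \delta_{\mu\mu'}$, with
which we can write a solution $X = \sum_{\lambda\mu} \delta_{\mu 0} \widehat{D}_{\lambda\mu} = \sum_\lambda \widehat{D}_{\lambda 0}$. To this end, we construct first the
dual set $\widehat{C}_{\lambda\mu}$ of the $\{ C_{\lambda\mu} \}$, which is easy:

\[ \widehat{C}_{\lambda\mu} = Q_\lambda \otimes \widehat{Z}_\mu^{(\lambda)} = \begin{cases} Q_\lambda \otimes \Pi_\lambda & \text{for } \mu = 0, \\ Q_\lambda \otimes Z_\mu^{(\lambda)} & \text{for } \mu \ne 0, \end{cases} \]

so that indeed $\operatorname{Tr} C_{\lambda\mu} \widehat{C}_{\lambda'\mu'} = \delta_{\lambda\lambda'} \delta_{\mu\mu'}$. Now, consider the $L M^2 \times L M^2$-matrix $T$,

\begin{align*}
T_{\lambda\mu,\lambda'\mu'} &= \operatorname{Tr} D_{\lambda\mu} \widehat{C}_{\lambda'\mu'} \\
 &= \operatorname{Tr} (\mathbb{1} - P_0) C_{\lambda\mu} (\mathbb{1} - P_0) \widehat{C}_{\lambda'\mu'} \\
 &= \delta_{\lambda\lambda'} \delta_{\mu\mu'} - \Delta_{\lambda\mu,\lambda'\mu'},
\end{align*}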

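with the deviation

\[ \Delta_{\lambda\mu,\lambda'\mu'} = \operatorname{Tr} P_0 C_{\lambda\mu} (\mathbb{1} - P_0) \widehat{C}_{\lambda'\mu'} + \operatorname{Tr} C_{\lambda\mu} P_0 \widehat{C}_{\lambda'\mu'}. \]

Here,

\begin{align*}
|\Delta_{\lambda\mu,\lambda'\mu'}| &\le 2 \| P_0 C_{\lambda\mu} \|_1 \| \widehat{C}_{\lambda'\mu'} \|_\infty \\
 &\le 2 \| P_0 C_{\lambda\mu} \|_1 = 2 \| P_0 |C_{\lambda\mu}| \|_1 \\
 &\le 2 \sqrt{\operatorname{Tr} P_0 |C_{\lambda\mu}|} \sqrt{\| C_{\lambda\mu} \|_1},
\end{align*}

using $\| \widehat{C}_{\lambda'\mu'} \|_\infty \le 1$, the unitary invariance of the trace norm, and Lemma 7 stated below. Since
$|C_{\lambda\mu}| = \frac{1}{\operatorname{Tr} Q_\lambda} Q_\lambda \otimes |Z_\mu^{(\lambda)}|$ is invariant under the action of $U^g$, we have $\operatorname{Tr} P_0 |C_{\lambda\mu}| = \operatorname{Tr} P_i |C_{\lambda\mu}|$ for
all $i$, and using $\sum_i s^* P_i \le \mathbb{1}$ we get

\[ |\Delta_{\lambda\mu,\lambda'\mu'}| \le 2 \sqrt{\frac{1}{s^* a}} \, \| C_{\lambda\mu} \|_1 \le 2 \sqrt{M} (s^* a)^{-1/2}. \]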

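With this and introducing a new parameter $\beta$ we get that

\[ \| T - \mathbb{1} \|_\infty \le \| T - \mathbb{1} \|_2 = \sqrt{\sum_{\lambda\mu\lambda'\mu'} |\Delta_{\lambda\mu,\lambda'\mu'}|^2} \le \sqrt{L^2 M^4 \cdot 4 M (s^* a)^{-1}} \le \frac{1}{\beta}, \tag{14} \]

where $s^* a \ge 4 \beta^2 L^2 M^5$. Assuming $\beta \ge 2$ (which will be the case with our later choice), we thus
know that $T$ is invertible; in fact, we have $T = \mathbb{1} - \Delta$ with $\| \Delta \|_\infty \le \frac{1}{\beta} \le \frac{1}{2}$, hence $T^{-1} = \sum_{k=0}^\infty \Delta^k$
and so

\[ \| T^{-1} - \mathbb{1} \|_\infty \le \sum_{k=1}^\infty \| \Delta \|_\infty^k \le \frac{2}{\beta}. \tag{15} \]

I.e., writing $T^{-1} = \mathbb{1} + \widehat{\Delta}$, we get

\[ |\widehat{\Delta}_{\lambda\mu,\lambda'\mu'}| \le \| \widehat{\Delta} \|_\infty \le \frac{2}{\beta}. \tag{16} \]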
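
For readers who want the geometric-series step behind the bound (16) spelled out, the following short derivation may help. It is only a sketch; the shorthand $x$ is introduced here for convenience and does not appear in the original argument.

```latex
\begin{align*}
\|T^{-1} - \mathbb{1}\|_\infty
  &= \Bigl\| \sum_{k \ge 1} \Delta^k \Bigr\|_\infty
   \le \sum_{k \ge 1} x^k = \frac{x}{1 - x},
   \qquad x := \|\Delta\|_\infty \le \frac{1}{\beta}, \\
\frac{x}{1 - x} &\le 2x \le \frac{2}{\beta}
   \qquad \text{whenever } x \le \tfrac{1}{2}, \text{ i.e. whenever } \beta \ge 2 .
\end{align*}
```

The assumption $\beta \ge 2$ is exactly what keeps the denominator $1 - x$ at least $\frac{1}{2}$.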


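The invertibility of $T$ implies that there is a dual set to $\{ D_{\lambda\mu} \}$ in $\operatorname{span}\{ \widehat{C}_{\lambda\mu} \}$. Indeed, from the
definition of $T_{\lambda\mu,\lambda'\mu'}$ and the dual sets,

\[ \widehat{C}_{\lambda'\mu'} = \sum_{\lambda\mu} T_{\lambda\mu,\lambda'\mu'} \widehat{D}_{\lambda\mu}, \quad \text{which can be rewritten as} \quad \widehat{D}_{\lambda\mu} = \sum_{\lambda'\mu'} (T^{-1})_{\lambda'\mu',\lambda\mu} \widehat{C}_{\lambda'\mu'}. \]

Now we can finally write down our candidate solution to Eq. (13):

\begin{align*}
X &= \sum_{\lambda\mu} \delta_{\mu 0} \widehat{D}_{\lambda\mu} \\
  &= \sum_{\lambda\mu} \sum_{\lambda'\mu'} \delta_{\mu 0} (T^{-1})_{\lambda'\mu',\lambda\mu} \widehat{C}_{\lambda'\mu'} \\
  &= \sum_\lambda \widehat{C}_{\lambda 0} + \sum_{\lambda\lambda'\mu'} \widehat{\Delta}_{\lambda'\mu',\lambda 0} \widehat{C}_{\lambda'\mu'} \\
  &= \mathbb{1} + \text{Rest}.
\end{align*}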

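The rest term can be bounded as follows:

\[ \| \text{Rest} \|_\infty \le \sum_{\lambda\lambda'\mu'} \frac{2}{\beta} = \frac{2}{\beta} L^2 M^2, \]

using Eq. (16). Thus we find $\| \text{Rest} \|_\infty \le 1$ if $\beta \ge 2 L^2 M^2$ and $s^* a \ge 4 \beta^2 L^2 M^5 \ge 16 L^6 M^9$. In this
case, we will have $0 \le X \le 2$ and $R_0 := \frac{1}{a} (\mathbb{1} - P_0) X (\mathbb{1} - P_0)$ satisfies

\[ 0 \le R_0 \le \frac{2}{a} (\mathbb{1} - P_0) \le \mathbb{1} - P_0, \]

as well as

\[ \frac{1}{|G|} \sum_{g \in G} (U^g)^\dagger R_0 U^g = \frac{1}{a} \mathbb{1}. \]

Thus we get the desired POVM $\left( R_i = \frac{a}{|G|} \sum_{g \text{ s.t. } 0^g = i} (U^g)^\dagger R_0 U^g \right)$ such that

\[ \sum_i R_i = \mathbb{1}, \quad R_i \ge 0, \quad \operatorname{Tr} P_i R_i = 0, \]

and we are done.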

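Lemma 7 (Lemma 15 in [24]) Let $\rho$ be a state and $P$ a projection in a Hilbert space $\mathcal{H}$. Then,

\[ \operatorname{Tr} \rho P \le \| \rho P \|_1 \le \sqrt{\operatorname{Tr} \rho P}. \]

More generally, for $X \ge 0$ and a POVM element $0 \le E \le \mathbb{1}$,

\[ \operatorname{Tr} X E \le \| X E \|_1 \le \sqrt{\operatorname{Tr} X} \sqrt{\operatorname{Tr} X E}. \]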


□ 


□ 
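
As an informal sanity check of Lemma 7, the sketch below tests the two inequalities numerically for a single qubit. The state `rho` and the family of rank-one projectors are arbitrary illustrative choices, not taken from the paper, and the closed-form trace norm used here is valid only for real 2x2 matrices.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 real matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def trace_norm2(m):
    """Trace norm of a real 2x2 matrix: the singular values s1, s2 satisfy
    s1^2 + s2^2 = ||m||_F^2 and s1 * s2 = |det m|,
    so (s1 + s2)^2 = ||m||_F^2 + 2 |det m|."""
    frob_sq = sum(x * x for row in m for x in row)
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return math.sqrt(frob_sq + 2 * abs(det))

def rank_one_projector(theta):
    """Projector onto cos(theta)|0> + sin(theta)|1>."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c * c, c * s], [c * s, s * s]]

def lemma7_holds(rho, proj, tol=1e-12):
    """Check Tr(rho P) <= ||rho P||_1 <= sqrt(Tr(rho P)) up to a tolerance."""
    rp = matmul2(rho, proj)
    tr = rp[0][0] + rp[1][1]
    tn = trace_norm2(rp)
    return tr <= tn + tol and tn <= math.sqrt(max(tr, 0.0)) + tol

# Illustrative qubit state (trace 1, positive definite); an assumption for the demo.
rho = [[0.7, 0.2], [0.2, 0.3]]
checks = [lemma7_holds(rho, rank_one_projector(j * math.pi / 12)) for j in range(12)]
print(all(checks))
```

A numerical check of this kind does not replace the proof in [24]; it only illustrates that both bounds can be tight or loose depending on how $P$ sits relative to the eigenbasis of $\rho$.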


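We even recover the Pusey/Barrett/Rudolph result [41] as a corollary: There, $\mathcal{E} = \{ |\psi_0\rangle, |\psi_1\rangle \}$
with (w.l.o.g.) $|\psi_{0,1}\rangle = \alpha |0\rangle \pm \beta |1\rangle$ qubit states, $1 > \alpha \ge \beta > 0$. We have the unitary phase action
of $\mathbb{Z}_2 = \{ \mathbb{1}, Z \}$, $Z |\psi_{0,1}\rangle = |\psi_{1,0}\rangle$, and hence on $\mathcal{E}^{\otimes n}$ we have a transitive action of $G = \mathbb{Z}_2^n \rtimes S_n$ (the
semidirect product), the symmetric group $S_n$ acting by permutation of the tensor factors and $\mathbb{Z}_2^n$
as $Z^{b_1} \otimes \cdots \otimes Z^{b_n}$. It has $L = n$ and $M \le n + 1$ [31], whereas the uniform feasible point
$s^* = (2\alpha^2)^{-n}$ with $a = 2^n$ gives

\[ s^* a = \frac{2^n}{(2\alpha^2)^n} = \frac{1}{\alpha^{2n}}. \]

Hence, for large enough $n$, we have that the latter exceeds $16 L^6 M^9 = \mathrm{poly}(n)$, and then
Proposition 5 above implies that $\mathcal{E}^{\otimes n}$ can be conclusively excluded. □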


D. General case 


We shall reduce the case of a general channel to that of a cq-channel. Indeed, recall that we
allow Alice and Bob to share entanglement, so Alice can encode information into the Bell states

\[ |\Phi_{uv}\rangle = (\mathbb{1} \otimes Z^v X^u) |\Phi\rangle = (X^u Z^v \otimes \mathbb{1}) |\Phi\rangle, \]

with the maximally entangled state $|\Phi\rangle = \frac{1}{\sqrt{|A|}} \sum_{i=1}^{|A|} |i\rangle |i\rangle$ and the discrete Weyl operators $X$ and
$Z$ (basis and phase shift). This effectively constructs a cq-channel (with $a = |A|$)

\[ M : [a]^2 \ni uv \longmapsto (\mathrm{id} \otimes \mathcal{N}) |\Phi_{uv}\rangle\langle\Phi_{uv}| = (X^u Z^v \otimes \mathbb{1}) \rho_{00} (Z^{-v} X^{-u} \otimes \mathbb{1}) \in \mathcal{S}(AB), \tag{17} \]

with the Choi-Jamiołkowski state $\rho_{00} = (\mathrm{id} \otimes \mathcal{N}) |\Phi\rangle\langle\Phi|$. Applying Theorem 6 to this channel is the
key to obtain the following result, which in turn directly implies the reverse direction ("nontrivial
$\Rightarrow$ positive capacity") in Theorem 3, concluding its proof.

Proposition 8 A non-commutative bipartite graph $K$ with support projection $P_{AB}$ onto the Choi-
Jamiołkowski range $(\mathbb{1} \otimes K) |\Phi\rangle$ has positive activated feedback-assisted zero-error capacity, $C_{0EF}(K) > 0$,
if and only if one of the following equivalent conditions holds:

i. $K$ is non-trivial, i.e. there is no constant channel $\mathcal{N}_0$ with $\mathcal{K}(\mathcal{N}_0) \le K$;

ii. There is no state $|\beta\rangle \in B$ with $|\beta\rangle \otimes A^\dagger \le K$;

iii. $\| P_B \|_\infty < |A|$;

iv. $\operatorname{Tr}_A (\mathbb{1} - P_{AB})$ has full rank;

v. $A(K) > 1$.


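Proof $C_{0EF}(K) > 0 \Rightarrow$ i. has been shown in the first part (necessity) of Theorem 3 at the start
of this section, likewise i. $\Leftrightarrow$ ii.

ii. $\Leftrightarrow$ iii. $P_{AB} \le \mathbb{1}_A \otimes \mathbb{1}_B$, hence $P_B \le |A| \mathbb{1}_B$, i.e. $\| P_B \|_\infty \le |A|$. Equality is attained if and
only if there exists an eigenvector $|\beta\rangle$ of $P_B$ with eigenvalue $|A|$, which is equivalent to $|A| =
\operatorname{Tr} |\beta\rangle\langle\beta| P_B = \operatorname{Tr} (\mathbb{1}_A \otimes |\beta\rangle\langle\beta|) P_{AB}$. But since $\mathbb{1}_A \otimes |\beta\rangle\langle\beta|$ has trace $|A|$ and $P_{AB}$ is a projector, this
is equivalent to $\mathbb{1}_A \otimes |\beta\rangle\langle\beta| \le P_{AB}$, or again equivalently $|\beta\rangle \otimes A^\dagger \le K$.

iii. $\Leftrightarrow$ iv. $\| P_B \|_\infty < |A|$ if and only if $P_B = \operatorname{Tr}_A P_{AB} < |A| \mathbb{1}_B$, if and only if $\operatorname{Tr}_A (\mathbb{1} - P_{AB}) > 0$.

iii. $\Rightarrow$ v. Simply observe that $S = \| P_B \|_\infty^{-1} \mathbb{1}$ is feasible for $A(K)$, since $\operatorname{Tr}_A (S \otimes \mathbb{1}) P_{AB} =
\| P_B \|_\infty^{-1} P_B \le \mathbb{1}_B$, and it has $\operatorname{Tr} S = |A| / \| P_B \|_\infty > 1$.

v. $\Rightarrow$ ii. We show the contrapositive: If $|\beta\rangle \otimes A^\dagger \le K$, then $\mathbb{1} \otimes |\beta\rangle\langle\beta| \le P_{AB}$. Now, if $S$ is
feasible for $A(K)$, we have $\mathbb{1}_B \ge \operatorname{Tr}_A (S \otimes \mathbb{1}) P_{AB} \ge \operatorname{Tr}_A (S \otimes \mathbb{1}) (\mathbb{1} \otimes |\beta\rangle\langle\beta|) = (\operatorname{Tr} S) |\beta\rangle\langle\beta|$, hence
$\operatorname{Tr} S \le 1$, and so $A(K) = 1$.

iii. $\Rightarrow C_{0EF}(K) > 0$. Consider the cq-channel $M$ in eq. (17). It has output state support
projectors

\[ P_{uv} = (X^u Z^v \otimes \mathbb{1}) P_{AB} (Z^{-v} X^{-u} \otimes \mathbb{1}), \quad u, v = 1, \ldots, a, \]

and we can verify directly that $\sum_{uv} P_{uv} = |A| \mathbb{1}_A \otimes P_B$, so its norm satisfies

\[ \Bigl\| \sum_{uv} P_{uv} \Bigr\|_\infty = |A| \, \| P_B \|_\infty < |A|^2. \]

In other words, it satisfies the requirements of item iv) in Theorem 6, hence $C_{0EF}(K) \ge
C_{0EF}(M) > 0$. □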


III. SHANNON THEORETIC UPPER BOUND ON $C_{0EF}(K)$

In this section we will develop an upper bound on the feedback-assisted zero-error capacity via
information theoretic ideas. For this purpose we first review the classical case, due to Shannon.


A. Shannon theoretic characterization of the fractional packing number: Shannon's Conjecture

The following characterization of the feedback-assisted zero-error capacity of a classical channel
was conjectured by Shannon at the end of his seminal paper [45], and to our knowledge proved
first by Ahlswede [2], in the context of his treatment of the capacity of arbitrarily varying (classical)
channels with instantaneous feedback, and using his very general results in that theory. Our
proof seems more direct, but then it is specially geared towards the zero-error setting.

Proposition 9 For a bipartite graph $\Gamma$ on $\mathcal{X} \times \mathcal{Y}$ such that every $x \in \mathcal{X}$ is adjacent to at least one $y \in \mathcal{Y}$,

\[ \log \alpha^*(\Gamma) = C_{\min}(\Gamma) := \min \{ C(N) : \Gamma(N) \subset \Gamma \}, \]

where $C(N)$ is the usual Shannon capacity of a noisy classical channel $N$.

Proof The left hand side is the zero-error capacity of $\Gamma$, assisted by feedback (plus some finite
amount of communication), $C_{0F}(\Gamma)$ [45]. From this, and the fact that feedback does not increase
the Shannon capacity of a channel [45] (which may also be proved invoking the Reverse Shannon
Theorem [6]), it follows that $C(N) \ge \log \alpha^*(\Gamma)$ for any eligible $N$, hence $C_{\min}(\Gamma) \ge \log \alpha^*(\Gamma)$.

There is also a direct proof of this that avoids operational arguments, relying instead only on
elementary combinatorial notions. It goes via showing that for every eligible channel $N$ and input
probability distribution $p$,

\[ V(p) := \log \min_y \frac{1}{\sum_x \Gamma(y|x) p_x} \le I(X : Y), \tag{18} \]

which is enough because $\max_p V(p) = \log \alpha^*(\Gamma)$, while of course the maximum of $I(X : Y)$ equals
$C(N)$. Now, eq. (18) is easily seen to be true for the uniform distribution $p_x = \frac{1}{|\mathcal{X}|}$. Namely, with the
equivocation sets $\mathcal{E}_y = \{ x : \Gamma(y|x) = 1 \}$ and the output probability distribution $q_y = \sum_x p_x N(y|x)$:

\begin{align*}
V(p) &= \log |\mathcal{X}| - \max_y \log |\mathcal{E}_y| \\
 &\le \log |\mathcal{X}| - \sum_y q_y \log |\mathcal{E}_y| \\
 &\le \log |\mathcal{X}| - \sum_y q_y H(X | Y = y) \\
 &= H(X) - H(X|Y) = I(X : Y),
\end{align*}

where we have used the fact that $P_{X|Y=y}$ is supported on $\mathcal{E}_y$, and the uniformity of the distribution
of $X$. For non-uniform $p$, we use the method of types [17] to reduce to the uniform case. In
detail, consider the product distribution $p^{\otimes n}$ and $X^n \sim p^{\otimes n}$ as input to the i.i.d. channel $N^{\otimes n}$.
Introducing the type $T = T(X^n)$ of the string $X^n$, we have:

\[ n I(X : Y) = I(X^n : Y^n) = I(T X^n : Y^n) = I(T : Y^n) + I(X^n : Y^n | T). \]

On the other hand, for every type $\tau$,

\begin{align*}
2^{-n V(p)} &= 2^{-V(p^{\otimes n})} \\
 &= \max_{y^n} \sum_{x^n} \Gamma(y^n | x^n) p_{x^n} \\
 &\ge \max_{y^n} \sum_{x^n \in \tau} \Gamma(y^n | x^n) p_{x^n} \\
 &= p^{\otimes n}(\tau) \max_{y^n} \sum_{x^n \in \tau} \frac{1}{|\tau|} \Gamma(y^n | x^n),
\end{align*}

since conditioned on $T(X^n) = \tau$, $X^n \sim u_\tau$ is uniformly distributed. Hence, using the uniform
case of the inequality (18),

\[ n V(p) \le \log \frac{1}{p^{\otimes n}(\tau)} + V(u_\tau) \le \log \frac{1}{p^{\otimes n}(\tau)} + I(X^n : Y^n | T = \tau), \]

and averaging over the different types this gives

\[ n V(p) \le H(T) + I(X^n : Y^n | T) \le O(\log n) + n I(X : Y), \]

because there are only $\mathrm{poly}(n)$ many different types, and letting $n \to \infty$ we are done.
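
The claim that the type variable contributes only $O(\log n)$ can be illustrated with a small count. The sketch below uses the standard stars-and-bars count of types; the function names and the alphabet size 3 are illustrative assumptions, not notation from the paper.

```python
from math import comb, log2

def num_types(n, alphabet_size):
    """Number of types (empirical distributions) of length-n strings over an
    alphabet of the given size, i.e. the number of ways to split n into
    alphabet_size non-negative counts."""
    return comb(n + alphabet_size - 1, alphabet_size - 1)

def type_entropy_bound(n, alphabet_size):
    """Upper bound (in bits) on H(T): the log of the number of types."""
    return log2(num_types(n, alphabet_size))

for n in (10, 100, 1000):
    k = 3  # illustrative alphabet size
    print(n, num_types(n, k), type_entropy_bound(n, k) / n)
```

The last column is the per-letter overhead $H(T)/n$, which shrinks as $n$ grows, in line with the limit taken in the text.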

So it remains only to show the opposite inequality. The proof uses the primal and dual linear
programming [15] (LP) characterisations of $\alpha^*(\Gamma)$ to construct an optimal channel $N(y|x)$, and in
fact also an optimal input distribution $p_x$, such that $C(N) = I(X : Y) = \log \alpha^*(\Gamma)$.

Recall the fractional packing number and choose optimal primal and dual solutions $w_x$ and $v_y$.
Define an input distribution $p_x := \frac{w_x}{\alpha^*(\Gamma)}$. This is the one that appears in Shannon's [45, Thm. 7],
and his $P_0^{-1}$ is the same as $\alpha^*(\Gamma)$. Now, by complementary slackness [15], if $M_x := \sum_y \Gamma(y|x) v_y > 1$,
then $w_x = p_x = 0$; per contrapositive, if $p_x > 0$, then $M_x = \sum_y \Gamma(y|x) v_y = 1$. Hence, we can define,
for these latter $x$,

\[ N(y|x) := \Gamma(y|x) v_y, \]

and in general for all $x$,

\[ N(y|x) := \frac{1}{M_x} \Gamma(y|x) v_y. \]

This is our candidate channel, and we have to convince ourselves that indeed $C(N) =
\log \alpha^*(\Gamma)$. First of all, let's confirm that with the above distribution $p$, the mutual information
$I(X : Y)$ equals $\log \alpha^*(\Gamma)$. Let $D(p \| q) = \sum_x p(x) \log \frac{p(x)}{q(x)}$ be the relative entropy between two
probability distributions $\{ p_x \}$ and $\{ q_x \}$, cf. [14]. Recall $I(X : Y) = \sum_x p_x D(N(\cdot|x) \| q)$, with the
output distribution

\[ q_y = \sum_x p_x N(y|x) = \sum_x p_x \Gamma(y|x) v_y = \frac{v_y}{\alpha^*(\Gamma)}, \]

using once more complementary slackness: the equality is trivial if $v_y = 0$, and if $v_y > 0$ then
$\sum_x \Gamma(y|x) w_x = 1$. In the present case, we calculate for all $x$,

\[ D(N(\cdot|x) \| q) = \sum_y \frac{\Gamma(y|x) v_y}{M_x} \log \left( \frac{\Gamma(y|x)}{M_x} \alpha^*(\Gamma) \right) = \log \frac{\alpha^*(\Gamma)}{M_x}, \]

which is $\log \alpha^*(\Gamma)$ for all $p_x > 0$ as then $M_x = 1$. So indeed $I(X : Y) = \log \alpha^*(\Gamma)$. But we see even
more: While all the relative entropies $D(N(\cdot|x) \| q)$ with $p_x > 0$ are equal to $\log \alpha^*(\Gamma)$, for $p_x = 0$
instead,

\[ D(N(\cdot|x) \| q) = \log \frac{\alpha^*(\Gamma)}{M_x} \le \log \alpha^*(\Gamma), \]

because $M_x \ge 1$. These two conditions (for $p_x > 0$ and $p_x = 0$) are well known, classic
characterizations of the Shannon capacity (cf. [16, 43]); they characterize an optimal input distribution for
given channel $N$, so indeed we prove $C(N) = \log \alpha^*(\Gamma)$. □
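
The construction in this proof can be traced on a toy instance. The sketch below uses a five-symbol "typewriter" graph, which is a hypothetical example chosen here and not one from the paper, together with hand-picked primal and dual LP solutions; it builds $p_x$ and $N(y|x)$ as in the proof and evaluates the mutual information.

```python
from math import log2, isclose

# Hypothetical example graph: input x is adjacent to outputs x and x+1 (mod 5).
n = 5
gamma = [[1 if y in (x, (x + 1) % n) else 0 for y in range(n)] for x in range(n)]

w = [0.5] * n  # primal (packing) weights on inputs, chosen by hand
v = [0.5] * n  # dual (covering) weights on outputs, chosen by hand

# LP feasibility: sum_x gamma(y|x) w_x <= 1 for all y, sum_y gamma(y|x) v_y >= 1 for all x.
primal_ok = all(sum(gamma[x][y] * w[x] for x in range(n)) <= 1 + 1e-12 for y in range(n))
M = [sum(gamma[x][y] * v[y] for y in range(n)) for x in range(n)]
dual_ok = all(m >= 1 - 1e-12 for m in M)

# Equal primal and dual values certify that both solutions are optimal.
alpha_star = sum(w)
values_match = isclose(alpha_star, sum(v))

# Input distribution and candidate channel as defined in the proof.
p = [wx / alpha_star for wx in w]
N = [[gamma[x][y] * v[y] / M[x] for y in range(n)] for x in range(n)]
q = [sum(p[x] * N[x][y] for x in range(n)) for y in range(n)]

def rel_entropy(px, qx):
    """Relative entropy D(px || qx) in bits, skipping zero-probability terms."""
    return sum(a * log2(a / b) for a, b in zip(px, qx) if a > 0)

mutual_info = sum(p[x] * rel_entropy(N[x], q) for x in range(n))
print(mutual_info, log2(alpha_star))
```

In this instance every $M_x$ equals 1, so all relative entropies coincide and the two printed numbers agree; graphs with some $M_x > 1$ would exercise the second optimality condition as well.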

Remark. Note that neither is $C_{\min}$ altered by allowing the use of entanglement as well as
feedback [10], nor $C_{0F}$ by allowing the use of entanglement and other no-signalling correlations [20].
B. Quantum generalization of the Shannon bound


Recall that for a channel $\mathcal{N} : \mathcal{L}(A) \to \mathcal{L}(B)$, the entanglement-assisted classical capacity,
i.e. the maximum rate of asymptotically error-free communication via many uses of the channel
assisted by a suitable pre-shared entangled state, is given by

\[ C_E(\mathcal{N}) = \max_\rho I(A : B)_\sigma = \max_\rho \{ S(\rho) + S(\mathcal{N}(\rho)) - S((\mathrm{id} \otimes \mathcal{N})\phi) \}, \]

where $\sigma_{AB} = (\mathrm{id} \otimes \mathcal{N}) \phi_{AA'}$ is the joint input-output state, $\phi_{AA'}$ is a purification of $\rho$, and $I(A :
B) = S(\sigma_A) + S(\sigma_B) - S(\sigma_{AB})$ is the quantum mutual information. In the particular case above,
we also write it $I(\rho; \mathcal{N}) = S(\rho) + S(\mathcal{N}(\rho)) - S((\mathrm{id} \otimes \mathcal{N})\phi)$.

Using this, we define for a non-commutative bipartite graph $K \le \mathcal{L}(A \to B)$ such that $\mathbb{1} \in
K^\dagger K$ (these are precisely the possible Kraus subspaces of channels):

\[ C_{\min E}(K) := \min \{ C_E(\mathcal{N}) : \mathcal{K}(\mathcal{N}) \le K \}. \]

That this is indeed a minimum follows from continuity of $C_E$ and the fact that the eligible channels
form a compact convex set. This definition is of course motivated by Proposition 9, suggesting
$2^{C_{\min E}(K)}$ as a possible quantum generalisation of the fractional packing number. For one thing,
for the quantum realisation $K$ of a classical equivocation graph $\Gamma$, it is easy to see that indeed
$C_{\min E}(K) = C_{\min}(\Gamma) = \log \alpha^*(\Gamma)$, see the remark at the end of the preceding Subsection III A.


At least, this quantity is related to the feedback-assisted zero-error capacity: Indeed, the result
of Bowen [10] (alternatively the Quantum Reverse Shannon Theorem [5, 7]) tells us that $C_E(\mathcal{N})$ is
not increased even by allowing feedback, so that $C_{0EF}(K)$ (and actually even $\overline{C}_{0EF}(K)$) is upper
bounded by the entanglement-assisted capacity $C_E(\mathcal{N})$ for any channel $\mathcal{N}$ such that $\mathcal{K}(\mathcal{N}) \le K$,
hence


Theorem 10 $C_{0EF}(K) \le C_{\min E}(K)$ for any non-commutative bipartite graph $K \le \mathcal{L}(A \to B)$. □

$C_{\min E}(K)$ shares many properties with $C_{\min}(\Gamma)$, to which it reduces for classical channels.
First, $C_{\min E}(K)$ is given by a minimax formula (min over channels and max over quantum
mutual information - see below) to which the minimax theorem applies, so it is also given by a
maximin (Lemma 11 below). Second, using this characterisation and properties of the von Neumann
entropy, it can be shown that $C_{\min E}$ is additive (Lemma 12 below). Third, thanks to the
operational definition of $C_E$, it can be easily seen to be monotonic under pre- and post-processing
(Lemma 13 below).

We shall need some well-known mathematical properties of the quantum mutual information.
The first is that $I(\rho; \mathcal{N})$ is concave in $\rho$ and convex in $\mathcal{N}$, just like its classical counterpart [1, 6].
The convexity in $\mathcal{N}$ follows from strong subadditivity: Let

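\begin{align*}
\sigma_{AB} &= \bigl( \mathrm{id} \otimes (\lambda \mathcal{N}^{(1)} + (1 - \lambda) \mathcal{N}^{(2)}) \bigr) \phi_\rho \\
 &= \lambda \sigma^{(1)}_{AB} + (1 - \lambda) \sigma^{(2)}_{AB} \\
 &= \operatorname{Tr}_{B'} \widetilde{\sigma}_{ABB'},
\end{align*}

with $\widetilde{\sigma}_{ABB'} = \lambda \sigma^{(1)}_{AB} \otimes |1\rangle\langle 1|_{B'} + (1 - \lambda) \sigma^{(2)}_{AB} \otimes |2\rangle\langle 2|_{B'}$. Then,

\begin{align*}
I(\rho; \lambda \mathcal{N}^{(1)} + (1 - \lambda) \mathcal{N}^{(2)}) &= I(A : B)_\sigma \\
 &\le I(A : BB')_{\widetilde{\sigma}} \\
 &= \lambda I(A : B)_{\sigma^{(1)}} + (1 - \lambda) I(A : B)_{\sigma^{(2)}} \\
 &= \lambda I(\rho; \mathcal{N}^{(1)}) + (1 - \lambda) I(\rho; \mathcal{N}^{(2)}).
\end{align*}

The concavity in $\rho$ can be seen as follows, using strong subadditivity again: For states $\rho^{(1)}$, $\rho^{(2)}$
with purifications $\phi^{(1)}$, $\phi^{(2)}$, respectively, and $0 \le \lambda \le 1$, we construct a purification of the mixture
$\lambda \rho^{(1)} + (1 - \lambda) \rho^{(2)}$, as follows:

\[ |\phi\rangle = \sqrt{\lambda} \, |\phi^{(1)}\rangle |11\rangle_{A'A''} + \sqrt{1 - \lambda} \, |\phi^{(2)}\rangle |22\rangle_{A'A''}. \]

With $\sigma_{AA'A''B} = (\mathrm{id}_{AA'A''} \otimes \mathcal{N}) \phi$, we have

\begin{align*}
I(\lambda \rho^{(1)} + (1 - \lambda) \rho^{(2)}; \mathcal{N}) &= I(AA'A'' : B)_\sigma \\
 &\ge I(AA' : B)_\sigma \\
 &\ge I(A : B | A')_\sigma \\
 &= \lambda I(\rho^{(1)}; \mathcal{N}) + (1 - \lambda) I(\rho^{(2)}; \mathcal{N}).
\end{align*}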


Lemma 11 For any non-commutative bipartite graph $K \le \mathcal{L}(A \to B)$,

\begin{align*}
C_{\min E}(K) &= \min_{\mathcal{N} \text{ s.t. } \mathcal{K}(\mathcal{N}) \le K} \; \max_\rho \; I(\rho; \mathcal{N}) \\
 &= \max_\rho \; \min_{\mathcal{N} \text{ s.t. } \mathcal{K}(\mathcal{N}) \le K} \; I(\rho; \mathcal{N}).
\end{align*}

Proof The first equation is the definition of $C_{\min E}(K)$, with the formula for $C_E(\mathcal{N})$ inserted.
Above we saw that the argument $I(\rho; \mathcal{N})$ is concave in the first and convex in the second
argument. Hence von Neumann's minimax theorem, or rather its generalisation due to Sion [47]
applies, allowing us to interchange the order of min and max. □


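Lemma 12 For non-commutative bipartite graphs $K_1 \le \mathcal{L}(A_1 \to B_1)$ and $K_2 \le \mathcal{L}(A_2 \to B_2)$,

\[ C_{\min E}(K_1 \otimes K_2) = C_{\min E}(K_1) + C_{\min E}(K_2). \]

Proof We show this by separately demonstrating "$\le$" and "$\ge$" in the above relation, using the
two expressions for $C_{\min E}$ from Lemma 11. In the following, choose optimal states $\rho_1$, $\rho_2$ and
channels $\mathcal{N}_1$, $\mathcal{N}_2$ for $K_1$, $K_2$, respectively.

By the first expression in Lemma 11,

\begin{align*}
C_{\min E}(K_1 \otimes K_2) &\le \max_\rho I(\rho; \mathcal{N}_1 \otimes \mathcal{N}_2) \\
 &= C_E(\mathcal{N}_1 \otimes \mathcal{N}_2) \\
 &= C_E(\mathcal{N}_1) + C_E(\mathcal{N}_2) \\
 &= C_{\min E}(K_1) + C_{\min E}(K_2),
\end{align*}

using the fact that the entanglement-assisted capacity is additive, proved by Adami and Cerf
in [1]. Note that $\mathcal{K}(\mathcal{N}_1 \otimes \mathcal{N}_2) = \mathcal{K}(\mathcal{N}_1) \otimes \mathcal{K}(\mathcal{N}_2) \le K_1 \otimes K_2$.

By the second expression in Lemma 11,

\[ C_{\min E}(K_1 \otimes K_2) \ge \min_{\mathcal{N} \text{ s.t. } \mathcal{K}(\mathcal{N}) \le K_1 \otimes K_2} I(\rho_1 \otimes \rho_2; \mathcal{N}), \]

and we need only to show that the minimum is attained at a product channel $\mathcal{N} = \mathcal{N}_1 \otimes \mathcal{N}_2$ with
$\mathcal{K}(\mathcal{N}_i) \le K_i$. For this purpose, consider the state

\[ \sigma_{A_1 A_2 B_1 B_2} = (\mathrm{id}_{A_1} \otimes \mathrm{id}_{A_2} \otimes \mathcal{N})(\phi_1 \otimes \phi_2) \]

for the purifications $\phi_i$ of $\rho_i$ ($i = 1, 2$). Now observe that with respect to $\sigma$,

\begin{align*}
I(A_1 A_2 : B_1 B_2) - I(A_1 : B_1) - I(A_2 : B_2) &= S(A_1 A_2) + S(B_1 B_2) - S(A_1 A_2 B_1 B_2) \\
 &\quad - S(A_1) - S(B_1) + S(A_1 B_1) \\
 &\quad - S(A_2) - S(B_2) + S(A_2 B_2) \\
 &= I(A_1 B_1 : A_2 B_2) - I(B_1 : B_2) - I(A_1 : A_2) \ge 0,
\end{align*}

because $I(A_1 : A_2) = 0$ and by strong subadditivity. In other words,

\begin{align*}
I(A_1 A_2 : B_1 B_2)_\sigma &\ge I(A_1 : B_1)_{\sigma_1} + I(A_2 : B_2)_{\sigma_2} \\
 &= I(A_1 A_2 : B_1 B_2)_{\sigma_1 \otimes \sigma_2},
\end{align*}

with the reduced states

\begin{align*}
\sigma_1 &= \sigma_{A_1 B_1} = \operatorname{Tr}_{A_2 B_2} \sigma = \bigl( \mathrm{id}_{A_1} \otimes (\operatorname{Tr}_{B_2} \circ \mathcal{N}) \bigr) (\phi_1 \otimes \rho_2), \\
\sigma_2 &= \sigma_{A_2 B_2} = \operatorname{Tr}_{A_1 B_1} \sigma = \bigl( \mathrm{id}_{A_2} \otimes (\operatorname{Tr}_{B_1} \circ \mathcal{N}) \bigr) (\rho_1 \otimes \phi_2).
\end{align*}

I.e.,

\[ I(\rho_1 \otimes \rho_2; \mathcal{N}) \ge I\bigl( \rho_1; \operatorname{Tr}_{B_2} \circ \mathcal{N}(\cdot \otimes \rho_2) \bigr) + I\bigl( \rho_2; \operatorname{Tr}_{B_1} \circ \mathcal{N}(\rho_1 \otimes \cdot) \bigr). \]

Finally, $\operatorname{Tr}_{B_2} \circ \mathcal{N}(\cdot \otimes \rho_2)$ is eligible: If $\mathcal{N}$ has Kraus operators $E_i \in K_1 \otimes K_2 \le \mathcal{L}(A_1 A_2 \to B_1 B_2)$,
and choosing an eigenbasis of $\rho_2$ and an arbitrary basis of $B_2$,

\[ \mathcal{K}\bigl( \operatorname{Tr}_{B_2} \circ \mathcal{N}(\cdot \otimes \rho_2) \bigr) = \operatorname{span}\{ \langle j|_{B_2} E_i |k\rangle_{A_2} : i, j, k \} \le K_1. \]

$\mathcal{K}\bigl( \operatorname{Tr}_{B_1} \circ \mathcal{N}(\rho_1 \otimes \cdot) \bigr) \le K_2$ is analogous, and we are done. □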

Lemma 13 All of $C_{0EF}$, $\overline{C}_{0EF}$ and $C_{\min E}$ are monotonic under pre- and post-processing of the channel:
For non-commutative bipartite graphs $K \le \mathcal{L}(A \to B)$ and $K_A \le \mathcal{L}(A' \to A)$, $K_B \le \mathcal{L}(B \to B')$, the
matrix-multiplied space $K_B K K_A \le \mathcal{L}(A' \to B')$ is a non-commutative bipartite graph, and

\begin{align*}
C_{0EF}(K) &\ge C_{0EF}(K_B K K_A), \\
\overline{C}_{0EF}(K) &\ge \overline{C}_{0EF}(K_B K K_A), \\
C_{\min E}(K) &\ge C_{\min E}(K_B K K_A).
\end{align*}

Proof For $C_{0EF}$ and $\overline{C}_{0EF}$ this follows directly from the operational definition: the pre- and
post-processings may be absorbed into the input modulation and feedback-decoding, respectively,
showing that a zero-error code for $K_B K K_A$ yields one for $K$.

For $C_{\min E}$, the argument is similar using the fact that $C_E(\mathcal{N})$ is the operational
entanglement-assisted capacity of the channel $\mathcal{N}$ [6]. □

We can now give yet another characterization of the feasibility of $C_{0EF}(K) > 0$, adding to the
list of Theorem 3 and Proposition 8.

Theorem 14 For any non-commutative bipartite graph $K$, $C_{0EF}(K) > 0$ if and only if $C_{\min E}(K) > 0$.

Proof The only way in which $C_{\min E}(K)$ can be 0 is that there is a channel $\mathcal{N}$ with $\mathcal{K}(\mathcal{N}) \le K$
and $C_E(\mathcal{N}) = 0$, i.e. $\mathcal{N}$ has to be constant. We have seen that this is equivalent to $|\beta\rangle \otimes A^\dagger \le K$ for
a state vector $|\beta\rangle \in B$. But by Theorem 3 this is precisely the characterization of $C_{0EF}(K)$ being
0. □

To illustrate the bound of Theorem 10, we consider the example of Weyl diagonal channels and
the dependence on the output state geometry for cq-channels.

Weyl diagonal channels. Denoting by $X$ and $Z$ the discrete translation and phase shift (which
generate a subgroup of the unitary group of cardinality $d^3$, thanks to the commutation relation
$XZ = \omega ZX$, $\omega = e^{2\pi i / d}$), consider the channel

\[ \mathcal{N}(\rho) = \sum_{a,b=0}^{d-1} p_{ab} X^a Z^b \rho Z^{-b} X^{-a}, \]

with probabilities $p_{ab} \ge 0$ summing to 1. Clearly,

\[ \mathcal{K}(\mathcal{N}) = \operatorname{span}\{ W_{ab} := X^a Z^b : p_{ab} > 0 \}, \]

i.e. this $K$ is characterised by a subset $S \subset \mathbb{Z}_d \times \mathbb{Z}_d$. It supports precisely those Weyl diagonal
channels $\mathcal{N}$ with $p_{ab} = 0$ for $ab \notin S$ - and of course many channels that are not Weyl diagonal.
First, note that $\mathcal{N}$ above is Weyl-covariant:

\[ \mathcal{N}(W_{ab} \rho W_{ab}^\dagger) = W_{ab} \mathcal{N}(\rho) W_{ab}^\dagger \]

for all $ab$. From this, and the irreducibility of the action of the Weyl operators on $\mathbb{C}^d$, it follows
that

\[ C_E(\mathcal{N}) = I\left( \tfrac{1}{d} \mathbb{1}; \mathcal{N} \right) = 2 \log d - H(p), \]

where $p = (p_{ab} : a, b = 0, \ldots, d - 1)$ is the probability vector. This means that for a $k$-element
$S \subset \mathbb{Z}_d \times \mathbb{Z}_d$ and $K = \operatorname{span}\{ W_{ab} : ab \in S \}$,

\[ \min_{\mathcal{N} \text{ Weyl-diag.},\; \mathcal{K}(\mathcal{N}) \le K} C_E(\mathcal{N}) = 2 \log d - \log k, \tag{19} \]

the minimum being attained at the uniform distribution on $S$: $p_{ab} = \frac{1}{k}$ for $ab \in S$, and 0 otherwise.
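
To make the minimisation in eq. (19) concrete, the sketch below evaluates the formula $2 \log d - H(p)$ for a few probability vectors supported on a set of size $k$. The function name `weyl_ce`, the parameters $d = 3$, $k = 4$ and the sample distributions are illustrative assumptions.

```python
from math import log2

def weyl_ce(d, p):
    """Value of 2 log d - H(p) in bits, the formula stated in the text for the
    entanglement-assisted capacity of a Weyl diagonal channel on C^d."""
    entropy = -sum(x * log2(x) for x in p if x > 0)
    return 2 * log2(d) - entropy

d, k = 3, 4                     # hypothetical dimension and support size
uniform = [1 / k] * k           # uniform distribution on S
skewed = [0.7, 0.1, 0.1, 0.1]   # another distribution on the same support
point = [1.0]                   # a single Weyl unitary, i.e. a noiseless channel

print(weyl_ce(d, uniform))      # the minimum, 2 log d - log k
print(weyl_ce(d, skewed))       # larger than the uniform value
print(weyl_ce(d, point))        # 2 log d, the dense-coding rate
```

Since entropy is maximised by the uniform distribution on a fixed support, any non-uniform choice on $S$ yields a larger value, which is the content of the sentence following eq. (19).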

We will now show that $2 \log d - \log k$ is an achievable rate of zero-error communication via this
channel when assisted by feedback (plus a constant activating amount of noiseless communication).
The key is the observation that if we use

\[ \mathcal{N}_0(\rho) = \frac{1}{k} \sum_{ab \in S} W_{ab} \rho W_{ab}^\dagger \]

with dense coding, i.e. with a maximally entangled state $|\Phi_d\rangle$ and sender modulation by the very
Weyl operators $W_{ab}$, the receiver making a Bell measurement in the basis $(W_{ab} \otimes \mathbb{1}) |\Phi_d\rangle$, we obtain
a generalised typewriter channel

\[ T : \mathbb{Z}_d \times \mathbb{Z}_d \to \mathbb{Z}_d \times \mathbb{Z}_d, \qquad T(ab|cd) = \begin{cases} \frac{1}{k} & \text{if } (a - c, b - d) \in S, \\ 0 & \text{otherwise.} \end{cases} \]

(And choosing a different $\mathcal{N}$ supported by $K$ changes only the non-zero transition probabilities.)
$T$ is easily seen to have fractional packing number $d^2 / k$, so its activated feedback-assisted
zero-error capacity is $2 \log d - \log k$. Hence $C_{0EF}(K) \ge 2 \log d - \log k$, and together with eq. (19) we
conclude

\[ C_{0EF}(K) = C_{\min E}(K) = C_{0F}(T) = 2 \log d - \log k. \]
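
The statement that the typewriter channel has fractional packing number $d^2/k$ can be checked on a small instance by exhibiting matching primal and dual LP solutions. In the sketch below the dimension $d = 3$ and the support set `S` are hypothetical choices, and uniform weights $1/k$ are used on both sides.

```python
from fractions import Fraction
from itertools import product

d = 3
S = {(0, 0), (0, 1), (1, 0), (1, 2)}  # hypothetical support set, so k = 4
k = len(S)
points = list(product(range(d), repeat=2))

def adjacent(inp, out):
    """Output (a, b) is reachable from input (c, e) iff (a - c, b - e) mod d lies in S."""
    (c, e), (a, b) = inp, out
    return ((a - c) % d, (b - e) % d) in S

w = {x: Fraction(1, k) for x in points}  # primal (packing) weights on inputs
v = {y: Fraction(1, k) for y in points}  # dual (covering) weights on outputs

# Packing constraint at every output, covering constraint at every input.
primal_feasible = all(sum(w[x] for x in points if adjacent(x, y)) <= 1 for y in points)
dual_feasible = all(sum(v[y] for y in points if adjacent(x, y)) >= 1 for x in points)

primal_value = sum(w.values())
dual_value = sum(v.values())
print(primal_value, dual_value)
```

Every output is reachable from exactly $k$ inputs and every input reaches exactly $k$ outputs, so both constraints hold with equality and the common value $d^2/k$ is optimal by LP duality.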





Finally, this is also the minimal zero-error communication cost to simulate a channel supported
by $K$ (using entanglement and shared randomness), making use of an idea in [6]: By the results
of [20], one can simulate $T$ with free shared randomness at communication rate $2 \log d - \log k$.
Now, if in the teleportation protocol using a maximally entangled state and the Weyl unitaries
$W_{ab}$, we replace the noiseless channel of $d^2$ messages by this $T$, one simulates exactly $\mathcal{N}_0$. □


Nontrivial dependence of $C_{0EF}$ on the channel geometry. Consider a non-commutative bipartite
graph corresponding to a pure state cq-channel, $K = \operatorname{span}\{ |\psi_i\rangle\langle i| \}$. We can see that $C_{0EF}(K)$
depends nontrivially on the geometry of the vector arrangement of the $|\psi_i\rangle$, even if they are all
pairwise non-orthogonal: Indeed, when they are close to parallel, $C_{0EF}(K)$ is arbitrarily close to
0, but when they are sufficiently close to being mutually orthogonal, $C_{0EF}(K)$ is arbitrarily close
to $\log |A|$.

Clearly, the closer to being parallel the $|\psi_i\rangle$ are, the larger the required $n$ in the argument in
Subsection II A becomes, so the lower bound moves closer to 0. On the other hand, this is really
necessary, since

\[ C_{\min E}(K) = \max_{(p_i)} S\Bigl( \sum_i p_i |\psi_i\rangle\langle\psi_i| \Bigr) \]

converges to 0 as the $|\psi_i\rangle$ get closer to being collinear.

In the other extreme, to show that $C_{0EF}(K) \to \log |A|$ when $C_{\min E}(K) \to \log |A|$, i.e. when the
$\psi_i$ become closer and closer to being orthogonal, we use once more the ideas from Subsection II A.
Assume that for all $i \ne j$, $|\langle\psi_i|\psi_j\rangle| \le \epsilon$, which is a more convenient expression for $C_{\min E}(K) \ge
\log |A| - \delta$.

We claim that if $\epsilon$ is small enough, we can use $K$ to simulate a "random superset channel"
(cf. [20]): for integers $t < a = |A|$ define the classical channel

\[ S_{a,t} : [a] \to \binom{[a]}{t} \]

such that

\[ S_{a,t} : i \longmapsto J \in \binom{[a]}{t} \text{ randomly with } i \in J, \]

where $\binom{[a]}{t} = \{ J \subset [a] : |J| = t \}$, the collection of all subsets of $[a]$ with $t$ elements. Note that
the transition probability matrix of $S_{a,t}$ is given by $\{ p(J|i) \}$ such that

\[ p(J|i) = \begin{cases} \binom{a-1}{t-1}^{-1} & \text{if } i \in J, \\ 0 & \text{otherwise,} \end{cases} \qquad i \in [a], \; J \in \binom{[a]}{t}. \]

Indeed, we use the characterization of [11], which will show that there is a deterministic
transformation of the set $\{ |\psi_i\rangle \}$ to the set $\{ |\varphi_i\rangle \}$, with

\[ |\varphi_i\rangle = \binom{a-1}{t-1}^{-1/2} \sum_{J \ni i} |J\rangle \in \mathbb{C}^{\binom{[a]}{t}}. \]

Once this is achieved, Bob measures the states $|\varphi_i\rangle$ in the computational basis, resulting in an
output of the channel $S_{a,t}$. To see this in detail, let us focus on the smallest possible case $t = 2$,
for which we see that for $i \ne j$, $\langle\varphi_i|\varphi_j\rangle = \frac{1}{a-1}$. The necessary and sufficient condition
required in [11] for the existence of a cptp map transforming $\{ |\psi_i\rangle \}$ into $\{ |\varphi_i\rangle \}$ is that there
exists a positive semidefinite $a \times a$-matrix $M$ such that $\Psi = \Phi \circ M$, where $\Psi = [\langle\psi_i|\psi_j\rangle]$ and
$\Phi = [\langle\varphi_i|\varphi_j\rangle]$ are the Gram matrices of the two input/output state sets, and $\circ$ denotes the
elementwise (Hadamard/Schur) product. In other words,

\[ M = \Psi \circ \Phi^{\circ -1} \ge 0, \quad \text{i.e.} \quad (a - 1) \Psi \ge (a - 2) \mathbb{1}. \]

However, all eigenvalues of $\Psi$ are lower bounded by $1 - (a - 1)\epsilon$, which is $\ge \frac{a-2}{a-1}$ as soon as
$\epsilon \le \frac{1}{(a-1)^2}$. In this case, we find $C_{0EF}(K) \ge C_{0F}(S_{a,2}) = \log a - 1$. Applying the same to multiple
copies of the channel, this reasoning shows that if $\epsilon \le (|A|^n - 1)^{-2}$, then $C_{0EF}(K) \ge \log a - \frac{1}{n}$. □
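
The combinatorics of the random superset channel can be verified by brute force for small parameters. The sketch below is an illustration only: the helper names are invented here, $a = 5$ and $t = 2$ are arbitrary, and the packing and covering weights are hand-chosen uniform values.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def superset_outputs(a, t):
    """All t-element subsets J of {0, ..., a-1}; input i can reach J iff i is in J."""
    return list(combinations(range(a), t))

def transition_prob(a, t, J, i):
    """p(J|i): uniform over the comb(a-1, t-1) subsets of size t that contain i."""
    return Fraction(1, comb(a - 1, t - 1)) if i in J else Fraction(0)

def gram_overlap(a, t, i, j):
    """Overlap of the uniform superpositions over t-subsets containing i resp. j."""
    common = sum(1 for J in superset_outputs(a, t) if i in J and j in J)
    return Fraction(common, comb(a - 1, t - 1))

def packing_certificate(a, t):
    """Primal weights 1/t on inputs and dual weights 1/comb(a-1, t-1) on outputs.
    Returns (primal feasible?, dual feasible?, primal value, dual value)."""
    outs = superset_outputs(a, t)
    w = Fraction(1, t)
    v = Fraction(1, comb(a - 1, t - 1))
    primal_ok = all(sum(w for _ in J) <= 1 for J in outs)
    dual_ok = all(sum(v for J in outs if i in J) >= 1 for i in range(a))
    return primal_ok, dual_ok, w * a, v * len(outs)

a, t = 5, 2
print(gram_overlap(a, t, 0, 1))   # off-diagonal Gram entry for t = 2
print(packing_certificate(a, t))  # matching values certify the packing number a/t
```

For $t = 2$ the matching primal and dual values equal $a/2$, whose logarithm is the rate $\log a - 1$ quoted in the text.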

We do not know whether in general $C_{0EF}$ equals $C_{\min E}$ or not. However, we can show that
the latter is a genuine capacity, as per the following theorem, whose proof however we relegate
to Appendix A because it would detract from our principal, zero-error argument.

Theorem 15 For any non-commutative bipartite graph $K$, the adversarial entanglement-assisted
classical capacity of $K$ is given by $C_{*E}(K) = C_{\min E}(K)$.

The definition of this capacity is as follows: An entanglement-assisted $n$-block code consists
of an entangled state (w.l.o.g. pure) $|\phi\rangle^{A_0 B_0}$, $N$ modulation cptp maps $\mathcal{E}_i : \mathcal{L}(A_0) \to \mathcal{L}(A^n)$ ($i =
1, \ldots, N$), and a POVM $(D_i)_{i=1}^N$ on $B_0 B^n$. The code is said to have error $\epsilon$ for $K^{\otimes n}$ if the (average)
error probability,

\[ P_{\mathrm{err}}(\mathcal{N}^{(n)}) = \frac{1}{N} \sum_{i=1}^N \Bigl( 1 - \operatorname{Tr} \bigl( (\mathcal{N}^{(n)} \circ \mathcal{E}_i \otimes \mathrm{id}) \phi \bigr) D_i \Bigr), \]

is $\le \epsilon$ for every channel with $\mathcal{K}(\mathcal{N}^{(n)}) \le K^{\otimes n}$. In this case, we call the collection $(\phi; \mathcal{E}_i, D_i)$ an
$(n, \epsilon)$-code for $K^{\otimes n}$. Denoting the largest number $N$ of messages of an $(n, \epsilon)$-code as $N(n, \epsilon; K)$,
the adversarial entanglement-assisted classical capacity is defined as

\[ C_{*E}(K) := \inf_{\epsilon > 0} \liminf_{n \to \infty} \frac{1}{n} \log N(n, \epsilon; K). \]

In Appendix A we shall actually show that

\[ \lim_{n \to \infty} \frac{1}{n} \log N(n, \epsilon; K) = C_{\min E}(K) \]

for every $0 < \epsilon < 1$ (this is known as a strong converse). There we will see that even allowing
entanglement and arbitrary feedback in the communication protocol does not increase the
capacity $C_{*E}(K)$ beyond $C_{\min E}(K)$, hence we may also address it as feedback-assisted adversarial capacity
$C_{*EF}(K)$.


IV. CONCLUSION 

We have introduced the problem of determining the zero-error capacity of a quantum channel
assisted by noiseless feedback. We showed that the capacity only depends on the
"non-commutative bipartite graph" $K$ of the channel, and that every nontrivial $K$ has positive capacity.

Motivated by Shannon's treatment of the classical case, we considered the minimisation of
entanglement-assisted classical capacities over all channels with the same non-commutative
bipartite graph and proved several properties of this definition: it is an upper bound on the
activated feedback-assisted zero-error capacity, it is given by a minimax/maximin formula, and is
additive. It is also shown to be equal to the adversarial entanglement-assisted capacity.

Note that when restricting all statements above to classical channels, which are given by a
bipartite equivocation graph $\Gamma$, all of these quantities boil down to the fractional packing number:

\[ 2^{C_{\min E}(K)} = 2^{C_{\min}(\Gamma)} = 2^{C_{0F}(\Gamma)} = \alpha^*(\Gamma), \]

which furthermore quantifies the zero-error capacity and simulation cost of $\Gamma$ when assisted by
general no-signalling correlations [20]: $2^{C_{0,NS}(\Gamma)} = 2^{S_{0,NS}(\Gamma)} = \alpha^*(\Gamma)$. However, for quantum
channels and non-commutative bipartite graphs these notions start diverging, so none of them can be
considered as a preferred "quantum fractional packing number": In [24], no-signalling assisted
zero-error capacity and simulation cost were determined for cq-channels, $C_{0,NS}(K) = \log A(K)$
and $S_{0,NS}(K) = \log \Sigma(K)$, with the semidefinite packing number $A(K)$ and another SDP $\Sigma(K)$,
and while in general (for cq-channels)

\[ \log A(K) \le C_{\min E}(K) \le \log \Sigma(K), \]

both inequalities can be strict [24]. It remains an open question how $C_{0EF}(K)$ fits into this picture,
and in particular whether it is equal to or sometimes strictly smaller than $C_{\min E}(K)$. We believe
that pure state cq-channels offer a good testing ground for ideas; we might take encouragement
from [49], where it was shown that the unambiguous capacity of a pure state cq-graph $K$ equals
$C_{\min E}(K)$. Other interesting $K$ are those that admit only one channel $\mathcal{N}$, for instance channels
extremal in the set of cptp maps, cf. [24], an example of which is the amplitude damping channel;
in this case, $C_{\min E}(K) = C_E(\mathcal{N})$.

Next, motivated by the fact that both $A(K)$ and $\Sigma(K)$ are SDPs (at least for cq-graphs), we
ask if there is a manifestly semidefinite programming (or even just convex optimisation)
characterisation of $2^{C_{\min E}(K)}$. To make progress, we need at least to understand some properties of an
optimal $\mathcal{N}$ for given $K$, and potentially also an optimal input state.

To offer a concrete approach to the question whether $C_{\min E}(K)$ is an achievable rate for pure
state cq-graph K, we suggest to look at the possible use of conclusive exclusion to implement a 
list-decoding protocol, by excluding more than one state by each outcome - cf. 01. 

List-decoding from approximate decoding? Given state vectors \ifi ),..., |V’iv) G B 
(w.l.o.g. \B\ = N) that are sufficiently orthogonal in the sense that there exists an 
orthonormal basis {|r>i),..., \vn)} of B such that 

Vi \{vi\f>i )\ 2 > 1—e. I For instance, this holds if for each i, ^ \(’4 > i\f > j )\ 2 < e, by lf33| . 

\ jC 

Then, does there exist a subset of N' > Q(A r| ~ r5 ) of these states, {[ ) : j = 1,... N 1 }, 

L < 0(N S ) (5 —> 0 with e —y 0 uniformly) and a POVM (m$ ■ S G such that 

{.3 ■ tyijWsHij) + 0} C S for all 5 G ( [ ^' ] )? 

Note that a positive answer would imply that by preparing ipi and measuring the POVM 
elements Ms, we construct a classical channel/hypergraph T with a* (T) > -f. To see this, ob¬ 
serve that each output S is reached from at most L inputs j, namely those j G S, so the weight 
distribution Wj = f for all i is admissible in the definition of a*(T). Thus we would obtain 

— — N' 

Coef(K) > C 0 F (T) > log j- > (1 — 26) logiV — 0(1), 

which is at least consistent with C(Af) being of the order (1 — e) log N — 0(1), by the existence of 
the basis ..., \vn)} and Fano's inequality. 

By Hausladen et al. Il33l this would imply that we can asymptotically achieve the rate C (Af) = 
Cmin f,(K) as activated feedback-assisted zero-error capacity, where K = span{|^j)(*| : i = 
1,..., N}. It would also imply a new proof of the result of ff49l . since we could use the Shan¬ 
non scheme H45l to get arbitrarily close to the rate log a*(T) by a deterministic list-decoding with 
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constant list size, and then constant activating communication, which we clearly can realize in an 
unambiguous fashion with constant overhead. 

Finally, there is another generalization of the instantaneous feedback considered by Shannon, 
which was dubbed "coherent feedback" in |j5]|, and which consist in the channel environment C 
from the Stinespring isometry V : A ^ B ® C to be handed back to Alice. More like Shannon's 
model, it is completely passive as it doesn't involve any action of Bob's. The resulting zero-error 
capacity, C 0 |i?}(U) is not even obviously a function of K only, nor is it clear whether additional 
free entanglement or free active feedback from Bob to Alice will increase it, though it is clear from 
the Quantum Reverse Shannon Theorem that all of Cq\e)(Y) and its variants are upper bounded 
by C E ( AO- 
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Appendix A: C minE (K) equals the adversarial entanglement-assisted capacity 

Here we give a complete proof of the following theorem from Section [Hi} 

Theorem 15 For any non-commutative bipartite graph $K$, the adversarial entanglement-assisted classical capacity of $K$ is given by $C_{*E}(K) = C_{\min E}(K)$.

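Before proving it, we show a simpler statement on so-called compound channels, which will be pivotal for the general proof. For a non-commutative bipartite graph $K < \mathcal{L}(A \to B)$ and a pure state $|\phi\rangle \in A \otimes A'$ such that $\phi^A = \phi^{A'} = \rho$, define $X = (\mathbb{1} \otimes K)|\phi\rangle < A \otimes B$ and the sets of states,

\[ \mathcal{S}_{K,\rho} := \{ (\mathrm{id} \otimes \mathcal{N})\phi : \mathcal{K}(\mathcal{N}) < K \} = \{ \sigma \in \mathcal{S}(AB) : \operatorname{supp}\sigma < X,\ \sigma^A = \rho \}, \]

as well as, for $\epsilon > 0$,

\[ \mathcal{S}^\epsilon_{K,\rho} := \{ \sigma \in \mathcal{S}(AB) : \exists\, \sigma' \in \mathcal{S}_{K,\rho} \text{ s.t. } \|\sigma - \sigma'\|_1 \leq \epsilon \}. \]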

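Proposition 16 For any non-commutative bipartite graph $K < \mathcal{L}(A \to B)$, a test state $\rho$ on $A$, a parameter $\epsilon > 0$ and an integer $k$, consider the family of cq-channels $\{ W^\sigma : S_k \to \mathcal{S}(A^k \otimes B^k) : \sigma \in \mathcal{S}^\epsilon_{K,\rho} \}$ with

\[ W^\sigma : \pi \mapsto W^\sigma_\pi = (\mathbb{1} \otimes U_\pi)\,\sigma^{\otimes k}\,(\mathbb{1} \otimes U_\pi)^\dagger. \]

Then, for sufficiently large $\ell$, there is an $\ell$-block code of $N = 2^{nR}$ messages ($n = k\ell$) and decoding POVM $(\Lambda_i)_{i=1}^N$, with rate

\[ R \geq \min_{\sigma \in \mathcal{S}_{K,\rho}} I(A:B)_\sigma - 2\delta, \]

and uniformly bounded error probability

\[ P_{\mathrm{err}}\bigl((W^\sigma)^{\otimes \ell}\bigr) = \frac{1}{N}\sum_{i=1}^N \operatorname{Tr}\Bigl[ \bigl(W^\sigma_{\pi_1(i)} \otimes \cdots \otimes W^\sigma_{\pi_\ell(i)}\bigr)(\mathbb{1} - \Lambda_i) \Bigr] \leq c^\ell \]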

for all a e Here, c < 1 and <5 = 2elog(|A||H|) + f + 2|H| 2 log( \ +|g|) . 

Proof The family of cq-channels $\{ W^\sigma : S_k \to \mathcal{S}(A^k \otimes B^k) : \sigma \in \mathcal{S}^\epsilon_{K,\rho} \}$ generates a compound channel, meaning that on block length $\ell$, the communicating parties face one of the i.i.d. channels $(W^\sigma)^{\otimes \ell}$, $\sigma \in \mathcal{S}^\epsilon_{K,\rho}$, but they do not know beforehand which one, so they need to use a code that is good for all of them.

For this we invoke the general result of Bjelakovic and Boche [8], which states that there are such codes with rate

\[ \min_\sigma \chi\Bigl(\Bigl\{ \tfrac{1}{k!},\ W^\sigma_\pi = (\mathbb{1} \otimes U_\pi)\,\sigma^{\otimes k}\,(\mathbb{1} \otimes U_\pi)^\dagger \Bigr\}\Bigr) - k\delta \]

for any $\delta > 0$ and with error probability uniformly bounded by $c^\ell$, $c = c(\delta) < 1$.

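By Lemma 17 below,

\[ \chi\Bigl(\Bigl\{ \tfrac{1}{k!},\ W^\sigma_\pi = (\mathbb{1} \otimes U_\pi)\,\sigma^{\otimes k}\,(\mathbb{1} \otimes U_\pi)^\dagger \Bigr\}\Bigr) \geq k\, I(A:B)_\sigma - 2|B|^2 \log(k + |B|), \]

and because there is $\sigma' \in \mathcal{S}_{K,\rho}$ with $\|\sigma - \sigma'\|_1 \leq \epsilon$, Fannes' inequality [29] shows that the rate (over $n = k\ell$) is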


> min I(A : B) a - 2elog(|A||H|) - | - 2 |H| 2 i^L±iMl _ 

(7 Ay Ay 


and we are done, choosing $\delta$ as advertised.

We end this proof pointing out a rather nice feature of the code: each message is encoded as an $\ell$-tuple of permutations from $S_k$, $i \mapsto \pi(i) = \pi_1(i) \ldots \pi_\ell(i)$, which we may view naturally as an element of $S_k \times \cdots \times S_k \subset S_n$, acting on $B^n$ by permuting the tensor factors, each $\pi_j(i)$ on its own block of $k$. Hence message $i$ is mapped to the state $W^\sigma_{\pi(i)} = (\mathbb{1} \otimes U_{\pi(i)})\,\sigma^{\otimes n}\,(\mathbb{1} \otimes U_{\pi(i)})^\dagger$ on $A^n B^n$. □
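The block structure of this code can be made concrete with a small sketch. The helper names `embed_blockwise` and `permute_factors`, and the representation of a permutation as a Python list, are assumptions made purely for illustration; they are not part of the construction above. The sketch embeds an $\ell$-tuple of permutations from $S_k$ into $S_n$ with $n = k\ell$, each permutation acting only on its own block of $k$ tensor factors.

```python
def embed_blockwise(perms, k):
    """Embed an l-tuple of permutations of range(k) into one permutation of range(k*l).

    perms[j] acts only on block j, i.e. on positions j*k, ..., j*k + k - 1.
    (Hypothetical helper, for illustration only.)
    """
    image = []
    for block, p in enumerate(perms):
        if sorted(p) != list(range(k)):
            raise ValueError("each entry must be a permutation of range(k)")
        image.extend(block * k + x for x in p)
    return image


def permute_factors(factors, perm):
    """Move the tensor factor at position s to position perm[s]."""
    out = [None] * len(factors)
    for s, t in enumerate(perm):
        out[t] = factors[s]
    return out


# Example: l = 3 blocks of k = 2 factors each, so n = 6.
pi = embed_blockwise([[1, 0], [0, 1], [1, 0]], k=2)
shuffled = permute_factors(list("abcdef"), pi)
```

In this toy example the first and third blocks are swapped internally while the second block is left alone, and no factor ever leaves its block, which is exactly the product structure $S_k \times \cdots \times S_k \subset S_n$.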


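Lemma 17 (Cf. Shor [46]) Consider a channel $\mathcal{N} : \mathcal{L}(A) \to \mathcal{L}(B)$ and a state $\rho$ on $A$ with purification $|\phi\rangle \in A \otimes A'$, and let $\sigma^{AB} = (\mathrm{id} \otimes \mathcal{N})\phi$. Then, for any integer $k$,

\[ \chi\Bigl(\Bigl\{ \tfrac{1}{k!},\ W_\pi = (\mathbb{1} \otimes U_\pi)\,\sigma^{\otimes k}\,(\mathbb{1} \otimes U_\pi)^\dagger \Bigr\}\Bigr) \geq k\, I(A:B)_\sigma - 2|B|^2 \log(k + |B|), \]

where $\pi$ ranges over the symmetric group $S_k$, acting on $B^k$ by permuting the tensor factors.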

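Proof With the average state

\[ \Omega^{A^k B^k} = \frac{1}{k!} \sum_{\pi \in S_k} (\mathbb{1} \otimes U_\pi)\,\sigma^{\otimes k}\,(\mathbb{1} \otimes U_\pi)^\dagger, \]

we have

\[ \begin{aligned} \chi\Bigl(\Bigl\{ \tfrac{1}{k!},\ (\mathbb{1} \otimes U_\pi)\,\sigma^{\otimes k}\,(\mathbb{1} \otimes U_\pi)^\dagger \Bigr\}\Bigr) &= S(\Omega^{A^k B^k}) - S(\sigma^{\otimes k}) \\ &= S(\Omega^{A^k}) + S(\Omega^{B^k}) - I(A^k : B^k)_\Omega - S(\sigma^{\otimes k}) \\ &= k\, I(A:B)_\sigma - I(A^k : B^k)_\Omega, \end{aligned} \]

where we have used that all ensemble members are just unitarily transformed versions of $\sigma^{\otimes k}$ (first line), the definition of the mutual information (second line), and the fact that $\Omega^{A^k} = (\sigma^A)^{\otimes k}$ and $\Omega^{B^k} = (\sigma^B)^{\otimes k}$ as well as additivity of the von Neumann entropy (third line).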

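Now we use the representation theory of $S_k$ acting on $B^k$ to bound the mutual information remaining: From Schur-Weyl duality [31] it is known that

\[ B^k = \bigoplus_\lambda Q^b_\lambda \otimes P_\lambda, \]

where $\lambda$ are Young diagrams with at most $b = |B|$ rows, $P_\lambda$ are the corresponding irreps of $S_k$, and $Q^b_\lambda$ is the multiplicity space, which is an irrep of the commutant representation, $SU(b)$. With the maximally mixed state $\tau_\lambda$ on $P_\lambda$, Schur's Lemma implies that

\[ \Omega^{A^k B^k} = \bigoplus_\lambda q_\lambda\, \omega_\lambda^{A^k Q^b_\lambda} \otimes \tau_\lambda^{P_\lambda}, \]

with a probability distribution $(q_\lambda)$ and states $\omega_\lambda$ on $A^k \otimes Q^b_\lambda$. Now observe that $\Omega^{A^k B^k}$ can by local operations $B^k \leftrightarrow D := \bigoplus_\lambda Q^b_\lambda$ be reversibly transformed into

\[ \Omega^{A^k D} = \bigoplus_\lambda q_\lambda\, \omega_\lambda^{A^k Q^b_\lambda}, \]

hence

\[ I(A^k : B^k)_\Omega = I(A^k : D)_\Omega \leq 2 \log |D| \leq 2 b^2 \log(k + b). \]

The latter because it is known that there are only $L \leq (k+1)^b$ Young diagrams and each $SU(b)$ irrep has dimension $|Q^b_\lambda| \leq M = (k+b)^{b^2/2}$, hence $|D| \leq LM = (k+1)^b (k+b)^{b^2/2} \leq (k+b)^{b^2}$, as we only need to consider the case $b \geq 2$. □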
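The counting at the end of this proof can be checked numerically for small parameters. The following is a minimal sketch, not part of the original argument: it enumerates the Young diagrams with $k$ boxes and at most $b$ rows, computes $|Q^b_\lambda|$ with the Weyl dimension formula and $|P_\lambda|$ with the hook length formula (both standard facts), and verifies the three bounds on $L$, $M$ and $|D|$ together with the dimension count $\sum_\lambda |Q^b_\lambda|\,|P_\lambda| = b^k$. All function names are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial


def young_diagrams(k, b, max_part=None):
    """Yield Young diagrams with k boxes and at most b rows as non-increasing b-tuples."""
    if max_part is None:
        max_part = k
    if b == 0:
        if k == 0:
            yield ()
        return
    for first in range(min(k, max_part), -1, -1):
        for rest in young_diagrams(k - first, b - 1, first):
            yield (first,) + rest


def dim_Q(lam):
    """Dimension of the SU(b) irrep labelled by lam (Weyl dimension formula)."""
    b = len(lam)
    d = Fraction(1)
    for i in range(b):
        for j in range(i + 1, b):
            d *= Fraction(lam[i] - lam[j] + j - i, j - i)
    return int(d)


def dim_P(lam):
    """Dimension of the S_k irrep labelled by lam (hook length formula)."""
    rows = [r for r in lam if r > 0]
    hooks = 1
    for i, r in enumerate(rows):
        for j in range(r):
            arm = r - j - 1
            leg = sum(1 for rr in rows[i + 1:] if rr > j)
            hooks *= arm + leg + 1
    return factorial(sum(rows)) // hooks


def counting_data(k, b):
    """Return (L, M, |D|) and check the bounds used in the proof."""
    diagrams = list(young_diagrams(k, b))
    L = len(diagrams)
    M = max(dim_Q(lam) for lam in diagrams)
    D = sum(dim_Q(lam) for lam in diagrams)
    # Dimension count from Schur-Weyl duality.
    assert sum(dim_Q(lam) * dim_P(lam) for lam in diagrams) == b ** k
    assert L <= (k + 1) ** b
    assert M ** 2 <= (k + b) ** (b * b)   # i.e. M <= (k+b)^(b^2/2)
    assert D <= (k + b) ** (b * b)
    return L, M, D


L, M, D = counting_data(3, 2)
```

For $k = 3$, $b = 2$ the two diagrams are $(3,0)$ and $(2,1)$, so $|D| = 4 + 2 = 6$, far below the bound $(k+b)^{b^2} = 625$; the bound is crude, but only its polynomial growth in $k$ matters for the lemma.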

Proof (of Theorem 15) First we show the upper bound, to be precise the strong converse. Because among the eligible channels is $\mathcal{N}^{\otimes n}$ with $\mathcal{K}(\mathcal{N}) < K$ attaining the minimum in $C_{\min E}(K)$, we see immediately that $C_{*E}(K) \leq C_E(\mathcal{N}) = C_{\min E}(K)$. In fact, the Quantum Reverse Shannon Theorem for $\mathcal{N}^{\otimes n}$ [5, 7] implies the strong converse as well, i.e. for all $\epsilon < 1$,

\[ \limsup_{n \to \infty} \frac{1}{n} \log N(n, \epsilon; K) \leq C_E(\mathcal{N}) = C_{\min E}(K). \]

A direct proof of this can be found in [26] (see also [27]). Furthermore, Bowen [10] (alternatively again the Quantum Reverse Shannon Theorem) showed that feedback does not increase the entanglement-assisted capacity.

It remains to show achievability of $C_{\min E}(K)$; for this it will be enough to show that for any test state $\rho$ on $A$, $C_{*E}(K) \geq \min_{\mathcal{K}(\mathcal{N}) < K} I(\rho; \mathcal{N})$, by exhibiting a sequence of codes with this rate and error probability going to 0, exponentially in $n$. Choose a purification $|\phi\rangle^{AA'}$ of $\rho$ and let Alice and Bob share $\phi^{\otimes n}$ as well as a maximally entangled state of Schmidt rank $n!$, which is measured by both parties in the computational basis to obtain a shared random permutation $\tau \in S_n$. Alice's encoding will be to subject her $n$ input $A'$-systems to a permutation $\pi(i)$ for each message $i = 1, \ldots, N$, then apply $\tau$ and send the resulting state through the channel $\mathcal{N}^{(n)}$; Bob will apply the permutation $\tau^{-1}$ to his $n$ output $B$-systems. The state this prepares for Bob is


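\[ \begin{aligned} \omega(i)^{A^n B^n} &= \frac{1}{n!} \sum_{\tau \in S_n} (\mathbb{1} \otimes U_\tau)^\dagger \Bigl[ (\mathrm{id} \otimes \mathcal{N}^{(n)}) \bigl( (\mathbb{1} \otimes U_\tau U_{\pi(i)})\,\phi^{\otimes n}\,(\mathbb{1} \otimes U_\tau U_{\pi(i)})^\dagger \bigr) \Bigr] (\mathbb{1} \otimes U_\tau) \\ &= (\mathbb{1} \otimes U_{\pi(i)}) \bigl[ (\mathrm{id} \otimes \overline{\mathcal{N}}^{(n)})\,\phi^{\otimes n} \bigr] (\mathbb{1} \otimes U_{\pi(i)})^\dagger, \end{aligned} \]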

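with the permutation-symmetrized channel

\[ \overline{\mathcal{N}}^{(n)}(\rho) = \frac{1}{n!} \sum_{\tau \in S_n} U_\tau^\dagger\, \mathcal{N}^{(n)}\bigl(U_\tau \rho U_\tau^\dagger\bigr)\, U_\tau. \]

Note that as $\mathcal{K}(\mathcal{N}^{(n)}) < K^{\otimes n}$, the same holds for $\overline{\mathcal{N}}^{(n)}$. The permutations $\pi(i)$ form a code for the compound channel

\[ \bigl\{ W^\sigma_\pi = (\mathbb{1} \otimes U_\pi)\,\sigma^{\otimes k}\,(\mathbb{1} \otimes U_\pi)^\dagger : \sigma^{AB} \in \mathcal{S}^\epsilon_{K,\rho} \bigr\} \]

according to Proposition 16 and its proof; here, $n = k\ell$, and we will determine $k$ and $\epsilon$ later. Bob will use the very decoding POVM $(\Lambda_i)$ from the same proposition.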

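To analyze the performance of this strategy, we apply the Constrained Postselection Lemma 18 to the permutation-symmetric state $\sigma^{(n)} = (\mathrm{id} \otimes \overline{\mathcal{N}}^{(n)})\,\phi^{\otimes n}$, with $X = (\mathbb{1} \otimes K)|\phi\rangle < A \otimes B$ and $\mathcal{R} = \operatorname{Tr}_B$:

\[ \sigma^{(n)} \leq (n+1)^{3|A|^2|B|^2} \int \mathrm{d}\sigma\, \sigma^{\otimes n}\, F(\sigma^A, \rho^A)^{2n}, \]

where the integral is over states $\sigma^{AB}$ supported on $X < AB$. We split the integral into two parts, a first where $F(\sigma^A, \rho^A) < 1 - \alpha$ and a second one where $F(\sigma^A, \rho^A) \geq 1 - \alpha$. Choosing $\alpha$ small enough ensures that the $\sigma^{AB}$ in the second part are in $\mathcal{S}^\epsilon_{K,\rho}$. Thus,

\[ \sigma^{(n)} \leq (n+1)^{3|A|^2|B|^2} (1-\alpha)^{2n}\, \sigma_0 + (n+1)^{3|A|^2|B|^2} \int_{F(\sigma^A,\rho^A) \geq 1-\alpha} \mathrm{d}\sigma\, \sigma^{\otimes n}, \]

with some state $\sigma_0$. At this point we can evaluate the error probability: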


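\[ \begin{aligned} P_{\mathrm{err}} &= \frac{1}{N} \sum_{i=1}^N \operatorname{Tr}\Bigl[ (\mathbb{1} \otimes U_{\pi(i)})\,\sigma^{(n)}\,(\mathbb{1} \otimes U_{\pi(i)})^\dagger (\mathbb{1} - \Lambda_i) \Bigr] \\ &\leq \mathrm{poly}(n) \Bigl[ (1-\alpha)^{2n} + \max_{\sigma \in \mathcal{S}^\epsilon_{K,\rho}} \frac{1}{N} \sum_{i=1}^N \operatorname{Tr}\Bigl( (\mathbb{1} \otimes U_{\pi(i)})\,\sigma^{\otimes n}\,(\mathbb{1} \otimes U_{\pi(i)})^\dagger (\mathbb{1} - \Lambda_i) \Bigr) \Bigr] \\ &\leq \mathrm{poly}(n) \bigl( (1-\alpha)^{2n} + c^{n/k} \bigr), \end{aligned} \]

showing that for every $k$ and $\epsilon$ the error probability goes to zero exponentially in $n$; in fact, at the same rate as the corresponding compound channel, except for the additional term $(1-\alpha)^{2n}$.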
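To see how the polynomial prefactor competes with the exponential term $(1-\alpha)^{2n}$, one can evaluate the product on a logarithmic scale. The sketch below is only an illustration under assumed sample values ($|A| = |B| = 2$, $\alpha = 0.1$), which do not come from the text, and `log2_extra_term` is a hypothetical helper name.

```python
from math import log2


def log2_extra_term(n, dim_a, dim_b, alpha):
    """log2 of (n+1)^(3|A|^2|B|^2) * (1-alpha)^(2n), computed without overflow."""
    exponent = 3 * dim_a ** 2 * dim_b ** 2
    return exponent * log2(n + 1) + 2 * n * log2(1 - alpha)


# Assumed sample values: qubit input and output, alpha = 0.1.
for n in (100, 1000, 10000):
    print(n, round(log2_extra_term(n, 2, 2, 0.1), 1))
```

For these sample values the term is still larger than 1 at $n = 100$ and only becomes small for $n$ in the thousands: the decay is exponential asymptotically, but the prefactor $(n+1)^{3|A|^2|B|^2}$ delays its onset considerably.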

The rate, according to Proposition 16, is $\geq \min_{\mathcal{K}(\mathcal{N}) < K} I(\rho; \mathcal{N}) - 2\delta$, where $\delta$, defined as in that proposition, can be made arbitrarily small by choosing $\epsilon$ small enough and $k$ large enough.

□

Remark Along the same lines, the use of permutation-symmetrization and the Postselection
Lemma allow to give a new proof of the coding theorem for arbitrarily varying cq-channels Q, 
by reducing it to a compound cq-channel [8], cf. also [36].

Observe however that what we treated here is not an "arbitrarily varying quantum channel" in
any sense previously considered EHH, going beyond the model in ff36H . too. 


Appendix B: A Constrained Post-Selection Lemma 

Here we show the following extension of the main technical result of [13] (albeit with a worse polynomial prefactor).

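Lemma 18 For a given Hilbert space $X$ with dimension $d$, denote by $\mathrm{d}\sigma$ the measure on the quantum states $\mathcal{S}(X)$ obtained by drawing a pure state from $X \otimes X'$ uniformly at random (i.e., from the unitarily invariant probability measure) and tracing out $X'$.

Then, for any $S_n$-invariant state $\rho^{(n)}$ on $X^n$,

\[ \rho^{(n)} \leq (n+1)^{3d^2} \int \mathrm{d}\sigma\, \sigma^{\otimes n}\, F\bigl(\sigma^{\otimes n}, \rho^{(n)}\bigr)^2. \]

The measure $\mathrm{d}\sigma$ is universal in the sense that it depends only on the space $X$.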

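Furthermore, let $\mathcal{R} : \mathcal{L}(X) \to \mathcal{L}(Y)$ be a cptp map and $\eta \in \mathcal{S}(Y)$ a state. Then, for every $S_n$-invariant state $\rho^{(n)}$ on $X^n$ with $\mathcal{R}^{\otimes n}(\rho^{(n)}) = \eta^{\otimes n}$,

\[ \rho^{(n)} \leq (n+1)^{3d^2} \int \mathrm{d}\sigma\, \sigma^{\otimes n}\, F\bigl(\mathcal{R}(\sigma), \eta\bigr)^{2n}. \]

Note that the right hand side depends only on $X$, $\mathcal{R}$, $\eta$ and $n$.

Here, $F(\xi, \eta) = \|\sqrt{\xi}\sqrt{\eta}\|_1$ is the fidelity between (mixed) states $\xi, \eta \in \mathcal{S}(X)$ [30, 35, 50].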

Remark Note that in the second bound, the contribution of states $\sigma$ with $F(\mathcal{R}(\sigma), \eta) \leq 1 - \epsilon$ is exponentially small in $n$. I.e., for a symmetric state with an additional constraint, expressed by $\mathcal{R}$ and $\eta$, the universal de Finetti state from [13] may be chosen in such a way that almost all its contributions also approximately obey the constraint.

Proof Denoting the uniform (i.e. unitarily invariant) probability measure over pure states $\zeta = |\zeta\rangle\langle\zeta|$ on $X \otimes X'$ by $\mathrm{d}\zeta$, it is well known that

\[ \int \mathrm{d}\zeta\, \zeta^{\otimes n} = \binom{n + d^2 - 1}{d^2 - 1}^{-1} \Pi_{\mathrm{Sym}^n(X \otimes X')}, \]

with $\Pi_{\mathrm{Sym}^n(X \otimes X')}$ denoting the projector onto the (Bose) symmetric subspace of $(X \otimes X')^{\otimes n}$. The reason is that the latter is an irrep of the $U^{\otimes n}$-representation for $U \in SU(d^2)$, so Schur's Lemma applies. Now we apply Caratheodory's Theorem, which says that $\mathrm{d}\zeta$ can be convex-decomposed into measures with finite support, more precisely ensembles $\{q_i, \zeta_i\}_{i=1}^{D^2}$, with $D = \binom{n + d^2 - 1}{d^2 - 1} \leq (n+1)^{d^2}$, the dimension of the Bose symmetric subspace of $(X \otimes X')^{\otimes n}$, and

\[ \sum_i q_i\, \zeta_i^{\otimes n} = \frac{1}{D}\, \Pi_{\mathrm{Sym}^n(X \otimes X')}. \]

For the moment we shall focus on one of these measures/ensembles.

It is also well known that one can purify $\rho^{(n)}$ in a Bose symmetric way, i.e. $\rho^{(n)} = \operatorname{Tr}_{X'^n} \varphi^{(n)}$ with $\varphi^{(n)} = |\varphi^{(n)}\rangle\langle\varphi^{(n)}|$ a pure state supported on the Bose symmetric subspace. Thus, with the
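operators $A_i := q_i\, |\zeta_i\rangle\langle\zeta_i|^{\otimes n}$,

\[ \begin{aligned} \varphi^{(n)} &= \Pi_{\mathrm{Sym}^n(X \otimes X')}\, \varphi^{(n)}\, \Pi_{\mathrm{Sym}^n(X \otimes X')} \\ &= D^2 \sum_{i,j} q_i q_j\, \zeta_i^{\otimes n}\, \varphi^{(n)}\, \zeta_j^{\otimes n} \\ &= D^2 \sum_{i,j} A_i\, \varphi^{(n)}\, A_j^\dagger \\ &\leq D^4 \sum_i q_i^2\, \zeta_i^{\otimes n}\, \langle\zeta_i|^{\otimes n} \varphi^{(n)} |\zeta_i\rangle^{\otimes n} \\ &\leq D^3 \sum_i q_i\, \zeta_i^{\otimes n}\, \langle\zeta_i|^{\otimes n} \varphi^{(n)} |\zeta_i\rangle^{\otimes n} \\ &\leq D^3 \sum_i q_i\, \zeta_i^{\otimes n}\, F\bigl( (\operatorname{Tr}_{X'} \zeta_i)^{\otimes n}, \rho^{(n)} \bigr)^2, \end{aligned} \]

where in the fourth line we have used Hayashi's pinching inequality [34], and in the fifth $q_i \leq \frac{1}{D}$; in line six we have invoked the monotonicity of the fidelity under cptp maps, here the partial trace, as well as $\operatorname{Tr}_{X'^n} \varphi^{(n)} = \rho^{(n)}$.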
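The polynomial prefactor in the lemma is controlled by the dimension $D$ of the Bose symmetric subspace. As a quick sanity check, the following sketch evaluates $D = \binom{n+d^2-1}{d^2-1}$ and compares it with the bound $(n+1)^{d^2}$ for small parameters; the function names are illustrative assumptions, not notation from the text.

```python
from math import comb


def sym_dim(n, m):
    """Dimension of the symmetric subspace of n copies of an m-dimensional space."""
    return comb(n + m - 1, m - 1)


def check_sym_bound(n, d):
    """Return D for X (x) X' of dimension d^2 and whether D <= (n+1)^(d^2)."""
    m = d * d
    D = sym_dim(n, m)
    return D, D <= (n + 1) ** m


D, ok = check_sym_bound(5, 2)
```

For $d = 2$ and $n = 5$ this gives $D = 56$ against the bound $6^4 = 1296$, so $D^3 \leq (n+1)^{3d^2}$ holds with room to spare.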

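Now we remember that $\{q_i, \zeta_i\}$ was just one of the Caratheodory components of the uniform measure $\mathrm{d}\zeta$, so by convex combination,

\[ \varphi^{(n)} \leq D^3 \int \mathrm{d}\zeta\, \zeta^{\otimes n}\, F\bigl( (\operatorname{Tr}_{X'} \zeta)^{\otimes n}, \rho^{(n)} \bigr)^2, \]

hence by partial trace over $X'^n$, and recalling the definition of $\mathrm{d}\sigma$, we arrive at

\[ \rho^{(n)} \leq D^3 \int \mathrm{d}\sigma\, \sigma^{\otimes n}\, F\bigl(\sigma^{\otimes n}, \rho^{(n)}\bigr)^2. \]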

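To obtain the second bound, we apply the map $\mathcal{R}^{\otimes n}$ to the states inside the above fidelity; by monotonicity of the fidelity once more,

\[ \begin{aligned} F\bigl(\sigma^{\otimes n}, \rho^{(n)}\bigr) &\leq F\bigl( \mathcal{R}^{\otimes n}(\sigma^{\otimes n}), \mathcal{R}^{\otimes n}(\rho^{(n)}) \bigr) \\ &= F\bigl( \mathcal{R}(\sigma)^{\otimes n}, \eta^{\otimes n} \bigr) \\ &= F\bigl( \mathcal{R}(\sigma), \eta \bigr)^n, \end{aligned} \]

as desired. □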
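The last step uses the multiplicativity of the fidelity under tensor products. A minimal numerical illustration is given below, restricted to the special case of commuting (diagonal) states, where the fidelity reduces to $\sum_x \sqrt{p_x q_x}$; this restriction, the sample distributions and the helper names are assumptions made for the sketch, which does not cover general mixed states.

```python
from itertools import product
from math import isclose, prod, sqrt


def fidelity(p, q):
    """Fidelity of two diagonal states, given by their eigenvalue lists p and q."""
    return sum(sqrt(a * b) for a, b in zip(p, q))


def tensor_power(p, n):
    """Eigenvalues of the n-fold tensor power of a diagonal state."""
    return [prod(t) for t in product(p, repeat=n)]


p = [0.7, 0.3]   # assumed sample state R(sigma), diagonal
q = [0.5, 0.5]   # assumed sample state eta, diagonal
f1 = fidelity(p, q)
f3 = fidelity(tensor_power(p, 3), tensor_power(q, 3))
```

Here `f3` agrees with `f1 ** 3` up to floating point error, and since `f1 < 1` the weight $F(\mathcal{R}(\sigma), \eta)^{2n}$ attached to such a $\sigma$ in the second bound of Lemma 18 decays exponentially in $n$.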

Remark It is the trick to sandwich the Bose-symmetric state $\varphi^{(n)}$ between symmetric subspace projectors, rather than bounding it directly by that projector, which allows the introduction of fidelities between the state and "test" product states.

Here we have used this to enforce a linear constraint valid for $\rho^{(n)}$ on the components of the de Finetti state on the right hand side. It turns out, perhaps unsurprisingly, that also other convex constraints (with a "good" behaviour linking $n = 1$ with the general case) are amenable to the same treatment, for instance membership in the convex set of separable states for a multipartite space $X = X_1 \otimes \cdots \otimes X_k$, and other similar sets, or even non-convex constraints. Such generalizations and their applications are discussed in [38].